Web Hosting Forum | Lunarpages
Author Topic: Importing Wordpress Containing Different Languages  (Read 429 times)
EazyVG
Trekkie
**
Offline Offline

Posts: 16


« on: May 15, 2008, 06:30:24 AM »

Hello there once again. As I am in the process of setting up, or rather moving, my blogs, I have encountered another problem.

Status: I am in the process of moving my blogs from wordpress.com to a self-hosted WordPress (wordpress.org) installation on Lunarpages.
Problem: The original blog has posts containing several languages/scripts, specifically Russian and Hindi in addition to English. When I import the file with the WP import tool (the file was exported with WP's built-in export tool), I get "?????" instead of letters.

Ex: the following page shows Russian and Hindi properly - http://eazyvg.linuxoss.com/about/
Once imported into the new blog, they are not shown - http://test.vicharbhatt.com/?page_id=13

What needs to be set in MySQL server or Wordpress to import all posts properly, with characters showing properly?
Logged
MrPhil
Quantum Encyclopedia Writer
*****
Offline Offline

Posts: 3108



« Reply #1 on: May 15, 2008, 07:19:38 AM »

Do you have the same character set (probably UTF-8) in the new database as in the old? Are the pages displayed in the same character set?  While you're at it, make sure you have the same collation order and any other character settings. Look at the .sql file (or whatever the dump output is) and see if the non-Latin text looks OK there (maybe there's a switch you have to set while dumping to tell WP how to handle non-Latin text).
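
One way to run that check on the dump without eyeballing it is a small script. This is only a rough sketch: the function name is made up for illustration, and the sample bytes stand in for a slice of the real export file.

```python
def summarize_encoding(data: bytes) -> dict:
    """Report whether raw export bytes are valid UTF-8 and how many non-ASCII characters they hold."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes are not UTF-8 at all, so the dump itself is suspect.
        return {"valid_utf8": False, "non_ascii_chars": 0}
    non_ascii = sum(1 for ch in text if ord(ch) > 127)
    return {"valid_utf8": True, "non_ascii_chars": non_ascii}


# Hypothetical sample standing in for part of the export file.
sample = "<title>Вичар Бхатт</title>".encode("utf-8")
print(summarize_encoding(sample))
```

If the file is valid UTF-8 and the count of non-ASCII characters is well above zero, the dump is probably fine and the damage happens later, on import. For a real file, pass the result of reading it in binary mode, e.g. `open(path, "rb").read()`.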
Logged

EazyVG
Trekkie
**
Offline Offline

Posts: 16


« Reply #2 on: May 16, 2008, 06:53:30 AM »

OK, for one, under WordPress Settings -> Reading -> "Encoding for Pages and Feeds", the value is UTF-8 in both cases: in the exporting wordpress.com blog as well as in the test wordpress.org blog, i.e. test.vicharbhatt.com.

As you can see from the XML file generated when exporting, English, Russian and Hindi fonts/characters are properly decoded:
(a small piece from file)
Quote
<title>1. About Me</title>
<link>http://eazyvg.linuxoss.com/about/</link>
<pubDate>Mon, 26 Sep 2005 06:25:47 +0000</pubDate>
<dc:creator>eazyvg</dc:creator>
<category>Uncategorized</category>
<category domain="category" nicename="uncategorized">Uncategorized</category>
<guid isPermaLink="false">/about/</guid>
<description/>

<content:encoded>
<img src="http://img146.imageshack.us/img146/7342/vicharbdd6.jpg" alt="" width="116" height="150" align="right" />Hi.

My name is Vichar Bhatt,

(<em>russian:</em> Вичар Бхатт),
<p class="western" style="margin-bottom:0;"><span style="font-family:Lucidasans;">(<em>hindi</em>: िवचार भट),</span></p>
a.k.a E@zyVG™ on the net,

friends simply call me VG
................................

The rest of the posts are also fine in the XML file, for both Russian and Hindi.

Once this is imported into test.vicharbhatt.com and then exported to XML again, we see the "??????":
(same piece)
Quote
<title>1. About Me</title>
<link>http://test.vicharbhatt.com/?page_id=13</link>
<pubDate>Mon, 26 Sep 2005 06:25:47 +0000</pubDate>
<dc:creator>eazyvg</dc:creator>
<category>Uncategorized</category>
<category domain="category" nicename="uncategorized">Uncategorized</category>
<guid isPermaLink="false">/about/</guid>
<description/>

<content:encoded>
<img src="http://img146.imageshack.us/img146/7342/vicharbdd6.jpg" alt="" width="116" height="150" align="right" />Hi.

My name is Vichar Bhatt,

(<em>russian:</em> ????? ?????),
<p class="western" style="margin-bottom:0;"><span style="font-family:Lucidasans;">(<em>hindi</em>: ????? ??),</span></p>
a.k.a E@zyVG™ on the net,

friends simply call me VG
...............................

Notice the bold-underlined words for comparison.

As I am exporting from a blog that is hosted on wordpress.com, I do not have the option to export in .sql format.

What should be next step to try to resolve the issue?
Logged
EazyVG
Trekkie
**
Offline Offline

Posts: 16


« Reply #3 on: May 16, 2008, 06:57:13 AM »

In the MySQL database, via phpMyAdmin, I see the following for test.vicharbhatt.com:

MySQL charset:  UTF-8 Unicode (utf8)
MySQL connection collation: utf8_unicode_ci
Logged
EazyVG
Trekkie
**
Offline Offline

Posts: 16


« Reply #4 on: May 16, 2008, 07:49:46 AM »

BTW, if I write a new post now on test.vicharbhatt.com in Russian, I again get question marks instead of letters.

So it seems the problem is not in the XML file exported by WordPress, but rather in the SQL settings. Perhaps I need to change something there prior to importing.
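
To illustrate why that would produce exactly these question marks, here is a small model. It assumes the text is being converted to latin1 somewhere on the way into the tables; MySQL's latin1 corresponds to Windows cp1252, and characters that do not exist in the target character set come out as "?". The function name is made up for the example, and this only imitates the conversion, it does not talk to MySQL.

```python
def store_in_latin1(text: str) -> str:
    """Model a conversion to MySQL latin1 (cp1252): unrepresentable characters become '?'."""
    return text.encode("cp1252", errors="replace").decode("cp1252")


print(store_in_latin1("Вичар Бхатт"))  # ????? ?????
print(store_in_latin1("E@zyVG™"))      # E@zyVG™ (the trademark sign exists in cp1252)
```

That matches what the re-exported XML shows: one "?" per Cyrillic letter, while the English text and the ™ sign survive.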
« Last Edit: May 16, 2008, 10:40:33 AM by EazyVG » Logged
EazyVG
Trekkie
**
Offline Offline

Posts: 16


« Reply #5 on: May 18, 2008, 06:06:26 AM »

Please help me resolve this issue, as I cannot move forward without resolving it.
Logged
MrPhil
Quantum Encyclopedia Writer
*****
Offline Offline

Posts: 3108



« Reply #6 on: May 18, 2008, 09:19:03 AM »

If you compare the resulting pages for the old and new sites, you will see that the code is different. Note the different <html> and <head> tags. Make them consistent and see if there are any differences now.

Both pages are UTF-8 character set. Have you confirmed that the new site's database was set to UTF-8 before importing the data? After importing the data, have you looked at some sample rows in the tables to see if the data looks correct?
Logged

EazyVG
Trekkie
**
Offline Offline

Posts: 16


« Reply #7 on: May 20, 2008, 05:16:09 AM »

MrPhil,

I tried a few things, starting from scratch, but still no go. I am very much a newbie at this.

Could you please explain to me what to do, step by step:
1. Export from the wordpress.com blog
2. Create a new wordpress.org blog using Fantastico
3. Change something in MySQL before importing, and so on ...

I'll be very thankful to you.
Logged
MrPhil
Quantum Encyclopedia Writer
*****
Offline Offline

Posts: 3108



« Reply #8 on: May 20, 2008, 08:32:37 PM »

I'm sorry, but I'm not familiar enough with Wordpress to give you step by step instructions. What I've covered are things easily seen from "the outside" and common problems for all sorts of applications. If your installation process creates the MySQL database for you (or you create it manually), you can go into phpMyAdmin and see what the character set and collation order are. If they don't match what was on the old (working) system, you'll have to change them before importing the data. I'd check them again after importing, and see if anything changed. For anything beyond that, I'll have to gracefully bow out and let someone familiar with Wordpress guide you...
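
As a rough sketch of that check: the query below could be pasted into phpMyAdmin's SQL tab to list each table's collation, and the helper flags the tables that are not utf8. The database name 'wordpress_db', the helper name and the sample rows are all placeholders, not taken from any real installation.

```python
# Query for phpMyAdmin's SQL tab; 'wordpress_db' is a placeholder database name.
LIST_COLLATIONS_SQL = (
    "SELECT TABLE_NAME, TABLE_COLLATION "
    "FROM information_schema.TABLES "
    "WHERE TABLE_SCHEMA = 'wordpress_db';"
)


def find_mismatched_tables(rows, expected_prefix="utf8_"):
    """Return the names of tables whose collation does not start with the expected prefix."""
    return [
        name
        for name, collation in rows
        if not (collation or "").startswith(expected_prefix)
    ]


# Hypothetical rows, shaped like the result of the query above.
rows = [("wp_posts", "latin1_swedish_ci"), ("wp_options", "utf8_general_ci")]
print(find_mismatched_tables(rows))  # ['wp_posts']
```

Running the same check before and after the import shows whether anything changed along the way.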
Logged

EazyVG
Trekkie
**
Offline Offline

Posts: 16


« Reply #9 on: May 21, 2008, 01:38:52 AM »

Here is what I found on the wordpress.org forum about this issue:

Comment out these two lines in wp-config.php:
Code:
define('DB_CHARSET', 'utf8');
define('DB_COLLATE', '');

Put // in front of both lines:
Code:
//define('DB_CHARSET', 'utf8');
//define('DB_COLLATE', '');

This DOES RESOLVE the issue of WordPress with languages other than English.
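
My guess at why this works, as a sketch only: without DB_CHARSET, WordPress presumably leaves the connection at the server default (latin1 here), so MySQL treats the UTF-8 bytes as latin1 text and stores them byte for byte instead of converting them. The two function names are made up, and Python's latin-1 codec is used as a stand-in for that pass-through.

```python
def stored_form(text: str) -> str:
    """What ends up in the table: each UTF-8 byte is kept as one latin1 character."""
    return text.encode("utf-8").decode("latin-1")


def read_back(stored: str) -> str:
    """What the blog shows: the same bytes come back and the browser reads them as UTF-8."""
    return stored.encode("latin-1").decode("utf-8")


name = "Вичар Бхатт"
stored = stored_form(name)
print(len(name), len(stored))     # 11 21
print(read_back(stored) == name)  # True
```

The text survives the round trip, but inside the database it is stored as 21 odd-looking latin1 characters rather than 11 real letters, so it would probably look garbled in phpMyAdmin or to any tool that connects with utf8.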

What I do not understand is this: when I create a new WordPress blog, phpMyAdmin shows that MySQL is set to
MySQL charset:  UTF-8 Unicode (utf8)
MySQL connection collation: utf8_unicode_ci
yet once the WordPress database has been created automatically by the Fantastico script, the tables are assigned the collation latin1_swedish_ce! Why not UTF-8?

I guess it is better to change them all manually to utf8_unicode_ci. Yes, or does it not matter?
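
If changing them manually, something like the following could generate the statements to paste into phpMyAdmin. This is a sketch under assumptions: the helper name and the table list are illustrative, and it is meant for fresh, empty tables before importing. Converting tables that already hold badly stored text may garble it further.

```python
def build_convert_statements(tables, charset="utf8", collation="utf8_unicode_ci"):
    """Build one ALTER TABLE statement per table to switch its character set and collation."""
    return [
        f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        for table in tables
    ]


# Illustrative table names; check the real list in phpMyAdmin first.
for statement in build_convert_statements(["wp_posts", "wp_comments"]):
    print(statement)
```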

For now, use the above mentioned fix to resolve the issue.

Maybe we should add somewhere in the upgrade instructions that the automatic Fantastico upgrade seemingly overwrites the old config file (adding the charset and collate lines), which makes non-English blogs "go boom"...


« Last Edit: May 21, 2008, 08:55:19 AM by EazyVG » Logged
EazyVG
Trekkie
**
Offline Offline

Posts: 16


« Reply #10 on: May 21, 2008, 07:39:15 AM »

http://eazyvg.linuxoss.com/2008/05/21/fixing-wordpress-and-mysql-charset-problem-especially-when-importing-from-one-blog-to-another/

hope that helps
« Last Edit: May 21, 2008, 08:55:36 AM by EazyVG » Logged
EazyVG
Trekkie
**
Offline Offline

Posts: 16


« Reply #11 on: May 21, 2008, 12:56:45 PM »

TO HAVE ALL POSTS IMPORTED WITH utf8_general_ci:

Set the collation to utf8_general_ci, then go to phpMyAdmin and DROP all tables, as mentioned in the Troubleshooting section.

After that, go to the domain where the blog is installed, go through the couple of install steps, and then import the file that was exported from the wordpress.com blog. DO NOT CHANGE the wp-config.php file.

ALL WORKING NOW.

FANTASTICO - After experimenting, I am quite sure that the Fantastico script has some problems with the charset setting.
« Last Edit: May 21, 2008, 01:13:33 PM by EazyVG » Logged
MrPhil
Quantum Encyclopedia Writer
*****
Offline Offline

Posts: 3108



« Reply #12 on: May 21, 2008, 05:17:51 PM »

Glad to hear you're making progress. Why latin1_swedish_ce (should that be _ci)? Vy noot? (Little joke there.) MySQL AB is based in Sweden, so they chose to make their local language settings the default. It's no more chauvinistic, I suppose, than a US developer making American English (en_US) the default. Nothing to worry about, but it could trip up the unwary.

Fantastico must be used with care if you're not on an English language, vanilla setup. Its defaults work for most basic installations, but for anything outside of that, beware. If you have an application that's expecting non-English characters, or anything else out of the ordinary, you'll have to go over the results from Fantastico with a fine-toothed comb and manually fix settings.
Logged
