Database architecture for multilingual content

Recently I had a discussion at work about database architecture supporting multilingual content again. This is a question that is been asked for so long – How to design the database to enable multilingual content ? Ok here the main goals are two:

  • performance
  • easy management – adding new fields, deleting existing fields and so on

As I see it, there are three approaches that deserve our attention:

  • the most common, mostly used in the past (according to me) – single table holding all the fields: common fields + language fields
  • the multilanguage tables – each language has own table that complement the data in the common table
  • single multilanguage table complementing the common table – the multilingual table has a primary key on id from the common table and the language

So I decided to investigate what is the best solution to use. Here is the database structure I used:

  • the first way:

CREATE TABLE IF NOT EXISTS `simplest` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title_en` varchar(255) NOT NULL DEFAULT ‘ ‘,
`title_bg` varchar(255) NOT NULL DEFAULT ‘ ‘,
`body_en` text NOT NULL,
`body_bg` text NOT NULL,
`sum_en` int(11) NOT NULL DEFAULT ’0′,
`sum_bg` int(11) NOT NULL DEFAULT ’0′,
PRIMARY KEY (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8

  • the second way

CREATE TABLE IF NOT EXISTS `lang_tbls` (
`id` int(20) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
) ENGINE=MyISAM

CREATE TABLE IF NOT EXISTS `lang_tbls_bg` (
`id` int(11) NOT NULL DEFAULT ’0′,
`title` varchar(255) NOT NULL DEFAULT ‘ ‘,
`body` text NOT NULL,
`sum` int(11) NOT NULL DEFAULT ’0′,
KEY `id` (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

CREATE TABLE IF NOT EXISTS `lang_tbls_en` (
`id` int(11) NOT NULL DEFAULT ’0′,
`title` varchar(255) NOT NULL DEFAULT ‘ ‘,
`body` text NOT NULL,
`sum` int(11) NOT NULL DEFAULT ’0′,
KEY `id` (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

  • and the third option

CREATE TABLE IF NOT EXISTS `one_tbl` (
`id` int(20) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
) ENGINE=MyISAM

CREATE TABLE IF NOT EXISTS `one_tbl_lang` (
`id` int(11) NOT NULL DEFAULT ’0′,
`lang` char(2) NOT NULL DEFAULT ‘en’,
`title` varchar(255) NOT NULL DEFAULT ‘ ‘,
`body` text NOT NULL,
`sum` int(11) NOT NULL DEFAULT ’0′,
PRIMARY KEY (`id`,`lang`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

Now as we know what are the options in details let us first see what is the best structure to be easy to maintain and extend. As you can suppose this is the last one. When needed to add a new lang … nothing to be done in the database. When needed to add a new field …. just alter one table. Between the first two choices better is the second just because the languages are separated and it is easier to manage, but having more tables of course.

Ok now the real deal – performace. I have created the structure above on my localhost mysql database. Then made a script and imported data of 5000 rows and then 10000 rows. Ran a few queries that according to me are mostly used and important to make the difference. And these are the results:

0.0018 sec/5000 rec, 0.0018 sec/10000 rec,
SELECT * FROM `simplest` WHERE 1 LIMIT 20 OFFSET 1000
0.0064 sec/5000 rec, 0.0152 sec/10000 rec,
SELECT * FROM `simplest` WHERE title_en LIKE ‘%ass%’ LIMIT 20
0.0090 sec/5000 rec, 0.0190 sec/10000 rec,
SELECT * FROM `simplest` WHERE title_en LIKE ‘%ass%’ OR title_bg LIKE ‘%ass%’ LIMIT 20

0.0030 sec/5000 rec, 0.0037 sec/10000 rec
SELECT * FROM `lang_tbls` t INNER JOIN `lang_tbls_en` l ON (t.id = l.id) WHERE 1 LIMIT 20 OFFSET 1000
0.0057 sec/5000 rec, 0.0060 sec/10000 rec
SELECT * FROM `lang_tbls` t INNER JOIN `lang_tbls_en` l ON (t.id = l.id) WHERE l.title LIKE ‘%ass%’ LIMIT 10
0.0116 sec/5000 rec, 0.0120 sec/10000 rec
SELECT * FROM `lang_tbls` t INNER JOIN `lang_tbls_en` l1 ON (t.id = l1.id) INNER JOIN `lang_tbls_bg` l2 ON (t.id = l2.id) WHERE 1 LIMIT 20 OFFSET 1000
0.0292 sec/5000 rec, 0.0325 sec/10000 rec
SELECT * FROM `lang_tbls` t INNER JOIN `lang_tbls_en` l1 ON (t.id = l1.id) INNER JOIN `lang_tbls_bg` l2 ON (t.id = l2.id) WHERE l1.title LIKE ‘%ass%’ OR l2.title LIKE ‘%ass%’ LIMIT 10
0.0157 sec/5000 rec, 0.0175 sec/10000 rec
SELECT * FROM `lang_tbls` t LEFT JOIN `lang_tbls_en` l1 ON (t.id = l1.id) LEFT JOIN `lang_tbls_bg` l2 ON (t.id = l2.id) WHERE 1 LIMIT 20 OFFSET 1000
0.0405 sec/5000 rec, 0.0445 sec/10000 rec
SELECT * FROM `lang_tbls` t LEFT JOIN `lang_tbls_en` l1 ON (t.id = l1.id) LEFT JOIN `lang_tbls_bg` l2 ON (t.id = l2.id) WHERE l1.title LIKE ‘%ass%’ OR l2.title LIKE ‘%ass%’ LIMIT 10

0.0098 sec/5000 rec, 0.0099 sec/10000 rec
SELECT * FROM `one_tbl` t INNER JOIN `one_tbl_lang` l ON (t.id = l.id AND l.lang = ‘en’) WHERE 1 LIMIT 20 OFFSET 1000
0.0430 sec/5000 rec, 0.0470 sec/10000 rec
SELECT * FROM `one_tbl` t INNER JOIN `one_tbl_lang` l ON (t.id = l.id AND l.lang = ‘en’) WHERE l.title LIKE ‘%ass%’ LIMIT 10
0.0025 sec/5000 rec, 0.0030 sec/10000 rec
SELECT * FROM `one_tbl` t INNER JOIN `one_tbl_lang` l ON (t.id = l.id AND (l.lang = ‘en’ OR l.lang = ‘bg’)) WHERE 1 LIMIT 20 OFFSET 1000
0.0060 sec/5000 rec, 0.0067 sec/5000 rec
SELECT * FROM `one_tbl` t INNER JOIN `one_tbl_lang` l ON (t.id = l.id AND (l.lang = ‘en’ OR l.lang = ‘bg’)) WHERE l.title LIKE ‘%ass%’ LIMIT 10
0.0170 sec/5000 rec, 0.0184 sec/5000 rec
SELECT * FROM `one_tbl` t LEFT JOIN `one_tbl_lang` l1 ON (t.id = l1.id AND l1.lang = ‘en’) LEFT JOIN `one_tbl_lang` l2 ON (t.id = l2.id AND l2.lang = ‘bg’) WHERE 1 LIMIT 20 OFFSET 1000
0.0457 sec/5000 rec, 0.0485 sec/10000 rec
SELECT * FROM `one_tbl` t LEFT JOIN `one_tbl_lang` l1 ON (t.id = l1.id AND l1.lang = ‘en’) LEFT JOIN `one_tbl_lang` l2 ON (t.id = l2.id AND l2.lang = ‘bg’) WHERE l1.title LIKE ‘%ass%’ OR l2.title LIKE ‘%ass%’ LIMIT 10

I think that it is really clear the the first choice is best in terms of performance. The queries I have used for the latter options are more because there is difference in what exactly you want to get from the database using different types of joins, which one can do with no joins in the first option. So if you want to emphasis on performance that is for you!

Ok let’s move on – the second and the third. As you can see the third option is better just in queries 3 and 4 – quite better I would say. But according to me these queries are queries that are rarely used. The can be used for searches in the databases for both languages which is really rare wanted feature, really. In all other queries the second option is better. So if you want to emphasis on maintenance and easy development – option 3 is for you.

But if you want to be between the two requirements – use the second option.

This post is also available in: Bulgarian

  1. Through pace together with the technical issues, the watch is wonderfully made, with blended finishing and the majority fine details on the pushers, to typically the tachymeter scale around the DLC viewed bezel. On wrist, the enjoy is large but the country’s size seems to be fitting on a watch with very much bravado. Part of that which was new for ones watch was a very simple and also elegant lead to. Mounted to the left with the case it again never jabs within your adjustable rate mortgage or wrist, and concluded on looking cool. I found not wearing running shoes also acted like a man-magnet as plenty of guys would discuss watch together with express how much they liked it. The lead to is stream-lined, built with a solid section of cut carbon. The as well as texture appears to be like great, and as a chronograph pusher and also work perfectly. Above it is a smaller recast pusher for that chronograph.

  2. As I’ve noted many times before, iOS updates are not supposed to bring battery life problems to iPhones or iPads. Typically, bad battery life is caused by third-party apps or users themselves.

  3. Airborne dirt and dust over the watch condition and pendant and perspiration might lead to skin reactions. Long expression erosion can lead to oxidation involving steel. To keep away from this prospective threat, can use toothbrush dropped in detergent and water regularly fresh the see case and band, then implement soft and never strong the water rinse, dry out preoccupy absorbent pad. Equipped through leather bracelet watches furthermore should adopt these principles, but not likely too stormy strap. If typically the leather wrist strap is splashed rainy, dry by using cloth liquid quickly. The wrist watches launch this series flagship cellular, a constrained edition discharge, finely, absolutely exceptional technology may make the glimmering and translucent relieve watchcase found perfect vision aesthetic feeling, make the actual human, the specifics more demonstrate elegant and outstanding superior quality.

  4. In Android Central’s test they tell us that Moto Voice of 202 Moto X responds almost instantly while Samsung Galaxy S6’s S Voice can take up to three seconds to ask for input.

  5. Apple’s software ecosystem is incredibly complicated, and changing one feature of an operating system often has ancillary or unanticipated consequences.

  1. No trackbacks yet.