Turbotodd

Ruminations on IT, the digital media, and some golf thrown in for good measure.

Digital Diet

We’re less than 24 hours away from financial Armageddon.  I’ve been stocking up on water and non-perishables in my garage, just in case.

No, no, no tin foil helmet radio for me.  Just kidding.

I’m confident our politicians are going to reach some fiscal sanity, and my understanding of the process is that the Senate is about to vote on the bill passed last evening by the House, so my fingers are crossed.

But let it be known that the U.S. Congress isn’t the only legislative body that’s busy attending to the people’s business.

In Japan, IBM announced yesterday that it’s helping the National Diet Library of Japan, the country’s only national library, to digitize its literary artifacts on a massive scale to make them widely available and searchable online. (The Diet is Japan’s legislative body.)

The prototype technology behind the system was built by IBM Research. It enables rapid full-text digitization of Japanese literature through broad recognition of Japanese characters, and it lets users collaboratively review and correct characters, script and structure.

The system is also designed to promote future international collaborations and standardization of libraries around the world.

“Nearly two decades ago in his book Digital Library, Dr. Makoto Nagao, the director of the National Diet Library, shared his vision that digitized and structured electronic books will dramatically change the role of libraries and the way knowledge will be shared and reused in our society,” said Dr. Hironobu Takagi, who led the development of the prototype technology at IBM Research – Tokyo.

“Until now, the breadth of the characters and expressions within the Japanese language had posed a series of challenges to massive digitization. In order to enable this transfer of knowledge from print to online, we realized the need for both machine and human intelligence to understand information in every form.”

Compared to languages that rely on just a few dozen alphabetical characters, Japanese is extremely diverse in terms of script. Beyond the syllabary characters, hiragana and katakana, Japanese includes about 10,000 kanji characters (including old characters, variants and 2,136 commonly used characters), as well as ruby (a small Japanese syllabary character reading aid printed right next to a kanji) and mixed vertical and horizontal texts.
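To make that script mix a little more concrete, here is a minimal Python sketch that tallies which script each character in a string belongs to. It assumes the standard Unicode block ranges for hiragana, katakana and the main CJK ideograph blocks; the function names `script_of` and `script_counts` are made up for illustration and have nothing to do with IBM's prototype.

```python
from collections import Counter


def script_of(ch: str) -> str:
    """Classify one character by its Unicode block (illustrative only)."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:      # Hiragana block
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:      # Katakana block
        return "katakana"
    # CJK Unified Ideographs plus Extension A; rarer kanji live in
    # further extension blocks that this sketch does not cover.
    if 0x4E00 <= cp <= 0x9FFF or 0x3400 <= cp <= 0x4DBF:
        return "kanji"
    return "other"


def script_counts(text: str) -> Counter:
    """Count characters per script, ignoring whitespace."""
    return Counter(script_of(ch) for ch in text if not ch.isspace())


if __name__ == "__main__":
    print(script_counts("日本語のテキスト"))
```

Even a short phrase like the one above mixes all three scripts, which hints at why character recognition for Japanese is a harder job than for an alphabet of a few dozen letters.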

Aside from ensuring quality recognition of Japanese characters, IBM researchers aimed to reduce the amount of time needed to review and verify the accuracy of the digitized texts. By introducing unique collaborative tools via crowdsourcing, the technology allows many users to quickly pore over the texts and make corrections at a much higher rate of productivity and efficiency.
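The announcement doesn't say how corrections from many reviewers get reconciled, so the following is only a guess at one simple approach: a majority vote, where a proposed correction replaces the machine's reading once enough reviewers agree on it. The function name `resolve_correction` and the `min_agreement` threshold are hypothetical.

```python
from collections import Counter


def resolve_correction(ocr_guess: str, proposals: list[str],
                       min_agreement: int = 2) -> str:
    """Pick the most-proposed correction, or keep the OCR reading.

    Assumption: a correction is accepted only when at least
    `min_agreement` reviewers proposed exactly the same text.
    """
    if not proposals:
        return ocr_guess
    winner, votes = Counter(proposals).most_common(1)[0]
    return winner if votes >= min_agreement else ocr_guess


if __name__ == "__main__":
    # The OCR misread 日 as the similar-looking 曰; two of three reviewers agree.
    print(resolve_correction("曰", ["日", "日", "目"]))
```

A lone reviewer can't overwrite the text under this scheme, which is one plausible way to trade a bit of speed for accuracy when lots of people are editing the same pages.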

Written by turbotodd

August 2, 2011 at 3:54 pm
