Edit: Reading some comments here - this doesn't seem to be about 'recognising' or 'predicting' existing characters, but using a dataset of characters to create a character by itself (which probably isn't an existing character).
This is quite 'clever'.
I don't understand the example shapes at the beginning. They're not correct strokes. How does that work?
The about page has some neat made up characters.
But after trying a few strokes, and doing so more carefully, it seems if you put in a clear radical, the character is well formed, kinda; if you put in a squiggle, all you get is a doodle... that makes sense.
Inputting 口 or 艹 for example, vs a random squiggle. Take care to make it reasonably accurate.
About characters, in case anyone doesn't know: a character is basically a 2x2 grid where 4 radicals get placed (there are about 201 radicals in modern Chinese; Japanese kanji too, I guess?). Sometimes 'cells get merged', so the left column of 2 rows is merged to contain 1 radical and the right contains 1 or 2 radicals. Or 'add a row' can happen at the top, for example adding a 艹 above the 2x2.
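To make that composition idea a bit more concrete, here's a rough TypeScript sketch (entirely my own made-up types, nothing from the demo) of a character as a small layout tree of radicals, where a 'merged cell' is just a cell that holds a single radical instead of splitting further:

```typescript
// Hypothetical model of the "2x2 grid with merged cells" idea:
// a character is a tree of cells, each either a single radical
// or a horizontal/vertical split into smaller cells.

type Radical = string; // e.g. "口", "艹", "木"

type Cell =
  | { kind: "radical"; radical: Radical }
  | { kind: "split"; direction: "horizontal" | "vertical"; parts: Cell[] };

// Made-up example: 艹 added as a row on top, with two radicals
// side by side underneath (the 'add a row above the 2x2' case).
const example: Cell = {
  kind: "split",
  direction: "vertical",
  parts: [
    { kind: "radical", radical: "艹" },
    {
      kind: "split",
      direction: "horizontal",
      parts: [
        { kind: "radical", radical: "口" },
        { kind: "radical", radical: "木" },
      ],
    },
  ],
};
```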
"a character is basically a 2x2 grid"...
Do you have any reference for this type of classification/composition?
N.B.: I am not implying that it is incorrect - I am a dabbler in Japanese Brush Calligraphy (without any fluency in the language, only as an art) and I would like to read more about this so if you have links or books (preferably in English) I would be very happy to learn more.
The practice paper given to children learning kanji is commonly partitioned into a 2x2 grid, to help with proportions -- kind of like how children learning to write English are given paper with guidelines for baseline, ascenders, descenders, and mean line. You can see an example of the 2x2 grid in the Nintendo kanji game on the article's "Info" page, or by searching for "kanji practice paper".
The lines are meant more to help relative placement than to exactly divide characters, but many characters do divide into left-and-right or top-and-bottom portions -- see Jack Halpern's SKIP system for example:
This was related to some work I did a few years ago but recently had time to retrain models to make it work inside the browser in an interactive setting.
The dataset used to train the network had to be refined a bit as well to match how humans write on a tablet.
Some previous discussion from a while back on the original non-interactive TensorFlow version:
Hi Jason! Please get in touch with me too; I have a lot of Chinese data to share and discuss.
I made https://pingtype.github.io - a program to break up sentences into words, pinyin and parallel translation, and typing characters by breaking them into glyphs.
I also just finished making a large dataset of glyph images of 52,000 characters from 1200 fonts - see my other comment for the download link.
Hit or miss for me. It doesn't want to fill in anything if I draw the enclosure of "wind": 風. It also fails to find some of the simplest kanji, while completions for just a stroke or two are quite complex. E.g. a simple downstroke suggests mouth 口 (just two more strokes), or 土 and whatnot, but the simple ones I'm after just won't come out. :) It is also stumped if I draw the first two strokes of 山; it apparently wasn't trained on this character enough to complete that middle stroke, and it basically doesn't want to do anything else, either.
Some out-of-order inference would be cool. E.g. draw the bottom four dots (fire) of 煎, and have various top parts emerge. For that purpose, it would be good if there were a reference frame. That is to say: an underlying square box to serve as a target for the supplied input. If you draw something near the bottom of the empty box, then it's understood by the neural network to be a bottom part of the kanji requiring a top. I think the whole concept could really benefit from a precise agreement between the user and the neural net about the bounding box.
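As a rough illustration of that bounding-box agreement (purely my own sketch, not how the demo actually works), the strokes could be mapped into a shared unit square, preserving where in the box the ink falls, and a coarse positional hint derived from that:

```typescript
// Hypothetical sketch: map raw canvas strokes into an agreed 1x1
// reference square, then classify which vertical band the ink sits in,
// so a model could be conditioned on "this is a bottom part".

type Point = { x: number; y: number };
type Stroke = Point[];

// Canvas coordinates -> unit square, keeping relative position intact.
function toReferenceBox(strokes: Stroke[], canvasSize: number): Stroke[] {
  return strokes.map(s =>
    s.map(p => ({ x: p.x / canvasSize, y: p.y / canvasSize }))
  );
}

// Crude positional hint: where inside the box did the user draw?
function verticalRegion(strokes: Stroke[]): "top" | "bottom" | "whole" {
  const ys = strokes.flat().map(p => p.y);
  const top = Math.min(...ys);
  const bottom = Math.max(...ys);
  if (bottom < 0.5) return "top";    // all ink in the upper half
  if (top > 0.5) return "bottom";    // all ink in the lower half, e.g. the four dots of 煎
  return "whole";
}
```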
I don't think it's your kanji writing; I can't figure out what it's supposed to do. I'm assuming it's supposed to autocomplete my kanji, but it failed at that for 国, 本, and 三. I'm pretty confident I wrote them right, especially the last one :)
It seems to just write random kanji based on the last stroke or something. But, writing random kanji has a certain coolness factor. Especially if I could copy-paste a kanji and get it to write it for me so that I know what the stroke order is supposed to be.
I think this demo was created recently, and the old article linked in the demo is there only for background info, as the author explained in one of the comments below.
TensorFlow.js only came out this year and the interactive sketch-rnn JavaScript browser demo that this was based off of is also quite recent.
I think the reason you believe the JS code is obfuscated is that part of the code contains the “weights” of the neural network: 4-5 million floating point numbers of an LSTM recurrent neural network.
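For anyone curious why that reads as "obfuscation", here's a hypothetical sketch (not the demo's actual loading code) of the pattern: the bulk of the bundle is one enormous flat array of float literals that gets sliced and reshaped into tensors at load time.

```typescript
import * as tf from "@tensorflow/tfjs";

// In the real bundle this literal would be millions of entries long,
// which is what makes the file look like machine-generated gibberish.
const flatWeights = new Float32Array([0.0123, -0.4567, 0.89 /* ... */]);

// Slice the flat array into named parameters, assuming known layer shapes
// (the shape table here is made up for illustration).
function loadWeights(flat: Float32Array, shapes: Record<string, number[]>) {
  const tensors: Record<string, tf.Tensor> = {};
  let offset = 0;
  for (const [name, shape] of Object.entries(shapes)) {
    const size = shape.reduce((a, b) => a * b, 1);
    tensors[name] = tf.tensor(flat.slice(offset, offset + size), shape);
    offset += size;
  }
  return tensors;
}
```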
Thanks for bringing this to my attention, although a link to the Stallman article which someone linked below would have been helpful. I've installed LibreJS, I'm curious to see how many things stop working now.