Foldable Words

bobbiechen · on Feb 12, 2021

Nutrimatic is a neat tool (based on the trie datastructure) that can help you find words or phrases things like this - the query i?t?s?a?p?l?e?a?s?u?r?e?t?o?s?e?r?v?e?y?o?u?&.{10,} finds subsequences of at least length 10: https://nutrimatic.org/?q=i%3Ft%3Fs%3Fa%3Fp%3Fl%3Fe%3Fa%3Fs%...

Since it uses Wikipedia as the text source and orders by frequency, the results tend to be more realistic phrases than you might find doing smashing words from a word list together.

twic · on Feb 12, 2021

Over christmas we had a look at the GCHQ christmas quiz [1]. The idea is that you solve a set of small puzzles, each producing a letter, which then become labels of the nodes of a directed graph, and you have to find a Eulerian path through the graph (visiting some nodes multiple times) which spells out a festive word or phrase.

We solved two or three of the small puzzles, but couldn't get the rest. I wondered if we could solve the overall puzzle without knowing the letters, by exploiting the structure of the graph: perhaps there are simply not many words or phases which could be produced by the graph.

Enumerating the possible paths through the graph was straightforward (there are a few hundred, i think). Matching those paths against individual words, if you have a word list, is easy. But where i was stumped was matching them against phrases containing multiple words. Trying every possible sequence of words, even lazily assembled, felt like it would be a combinatorial explosion that i would never be able to get through.

Using a corpus of phrases from wikipedia is absolute genius.

[1] https://www.gchq.gov.uk/news/christmas-card-2020 - note that the edge with a double arrow is really two edges, one in each direction

tgv · on Feb 12, 2021

That's the first time I see a regular expression implementation with intersection, apart from my own from many, many moons ago. A pleasant surprise. I also had added a negation operator, which adds another level of unintuitive behavior. E.g., you might think that !a would block aa from matching, but it doesn't.

The anagram operator is a new one for me. Nice.

pwiecz · on Feb 12, 2021

I've once implemented a regular expression engine based directly on https://en.wikipedia.org/wiki/Brzozowski_derivative – it's straightforward to implement both intersection and negation this way.

_pvxk · on Feb 12, 2021

hfst (open source rewrite of the Xerox finite state toolkit) includes negation:

    $ echo 'a b
    c b
    c d*' | hfst-regexp2fst > ab.fst
    $ echo 'c ?+' | hfst-regexp2fst >cdotplus.fst
    $ hfst-intersect ab.fst cdotplus.fst | hfst-expand -c3
    cb
    cd
    cdd
    cddd
    cdddd
    $ hfst-expand -c3 ab.fst
    ab
    cb
    c
    cd
    cdd
    cddd

The regex syntax is a bit quirky due to backwards compatibility with lexicons written in XFST, see https://github.com/hfst/hfst/wiki/Regular-Expression-Operato...

goodmachine · on Feb 12, 2021

I recommend 'o the pelican' - an entertaining subreddit of folded poetry found in tweets

https://www.reddit.com/r/othepelican/comments/86z3zk/o_the_p...

Zanni · on Feb 12, 2021

That's a delightful anecdote to kick off a computational challenge.

berkes · on Feb 12, 2021

> Not all of those statements would have been suited to the occasion of our rendezvous at the Maple Diner, but over the course of our years together—17 years, as it turned out—there came a moment for each of them.

Did make me sad, though. Considering the last message found in the text.

dqpb · on Feb 12, 2021

Those 8 lines are a poetic sequence describing a life with a partner. I couldn't help myself though, I had to move one line:

I love you.

I tease you.

I pleasure you.

I please you.

I pester you.

I peeve you.

I salute you.

I leave you.

ineptech · on Feb 12, 2021

Yes, this is like a projecteuler question collided with a short story.

mode80 · on Feb 12, 2021

I really hope I’m this sharp and poetic when I’m 71.

sitkack · on Feb 13, 2021

I hope you are too, you can crack me jokes I won't get.

oh_sigh · on Feb 12, 2021

Wouldn't the regex be the phrase interleaved with question marks: i?t?s?a?p?l?e? etc?

Using that on my /usr/share/dict/words, I get 634 matches, which is about in line with what the author got.

tgv · on Feb 12, 2021

Apart from the ^...$, it's exactly that.

nayuki · on Feb 12, 2021

The mathematical term for dropping letters is taking a "subsequence", https://en.wikipedia.org/wiki/Subsequence . This is not the same as taking a substring, which is understood to be a contiguous subsequence.

Waterluvian · on Feb 12, 2021

I want to know more about Lynn.

endtime · on Feb 12, 2021

I really wish this didn't fold to the phrase "I wank now more ably".

elicash · on Feb 12, 2021

The Mad Magazine Fold-In Effect in CSS:

https://thomaspark.co/2020/06/the-mad-magazine-fold-in-effec...

moolcool · on Feb 12, 2021

https://i.imgur.com/iAPlVCr.mp4

adius · on Feb 12, 2021

A true nerd: Gets a love letter - Writes a lengthy computational analysis what else could have been written =P. Joking aside, really interesting blog post