
Fantastic talk. Loved it! Thanks for the post and the commenters who recommended it.

Formatting was terrible -- even when viewing source!

This made it somewhat readable:

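    import requests
    from bs4 import BeautifulSoup

    url = 'https://jackrusher.com/strange-loop-2022/'
    # Name the parser explicitly; bs4 warns when it has to guess one.
    bs = BeautifulSoup(requests.get(url).text, 'html.parser')
    # Drop the timestamp strings (they start with '00') and join the rest.
    muh_text = ' '.join(x for x in bs.stripped_strings if not x.startswith('00'))
    print(muh_text)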



I went with this JavaScript:

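    let c; // paragraph that following paragraphs get merged into
    for (let p of document.querySelectorAll('body>p, body>div')) {
      // An aside ends the current run of merged paragraphs.
      if (p.classList.contains('aside')) { c = undefined; continue; }
      // Drop the timestamp; ?. avoids a TypeError when an element has none.
      p.querySelector('span.time')?.remove();
      if (c) {
        c.innerHTML += ' ' + p.innerHTML; p.remove();
      } else { c = p; }
    }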
And this CSS:

    body p { width: 100%; }
    body div.aside { width: 100%; border: 1px solid black; }
I'm thinking maybe that page was supposed to be embedded somewhere, next to a video maybe? It wasn't meant to be read like this, right?


Really shoddy DRM maybe?


Definitely not!

Sorry you hated the formatting. The transcript is meant to be an assistive technology for the video, and a place to put extra notes I couldn't fit into the time I had. Ideally, the transcript would scroll as the video advances and the timestamps would move the playhead to that part of the talk, but I haven't time this week to do as much hacking on that as I'd like.
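
For anyone curious what the scroll-with-the-video part might involve, here is a rough sketch of the lookup at its core: turn each timestamp into seconds, then find which transcript line the playhead is currently inside. It is only a guess at one way to do it, not how the page works. The 'HH:MM:SS' format, the sample timestamps, and the names `to_seconds` and `active_cue` are all assumptions made up for illustration.

```python
from bisect import bisect_right

def to_seconds(stamp):
    """Convert an 'HH:MM:SS' or 'MM:SS' string to whole seconds."""
    seconds = 0
    for part in stamp.split(':'):
        seconds = seconds * 60 + int(part)
    return seconds

def active_cue(cue_starts, playhead):
    """Index of the transcript line in progress at `playhead` seconds.

    `cue_starts` must be sorted ascending. Returns -1 if the playhead
    is before the first line.
    """
    return bisect_right(cue_starts, playhead) - 1

# Hypothetical timestamps, not taken from the actual transcript.
stamps = ['00:00:05', '00:00:42', '00:01:30']
starts = [to_seconds(s) for s in stamps]
```

Going the other way (click a timestamp, move the playhead) would just be `to_seconds` on the clicked timestamp. The binary search keeps the per-frame lookup cheap even for a long transcript.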


Absolutely no issue! I've only just come across your work and am astounded at both its breadth and depth.

I thought perhaps the page wasn't owned by you, and that someone had ripped it from subtitles or something similar.

The link for the article looks to have changed; before it was to a website that had text next to timestamps.



