To remove AMP cruft:
curl http://web.archive.org/web/20200111193123/https://www.wsj.com/amp/articles/paging-dr-google-how-the-tech-giant-is-laying-claim-to-health-data-11578719700 \
  | sed -n '/./{/<p>By<.p>/,/<\/div>/p;/<title>/,/<\/title>/p;};/<p>By/!{/<p>/p;};/=.sub-head/p;/./{/<h1 /,/<.h1>/p;}' > a.html
firefox a.html
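The sed filter above relies on two ideas: `-n` turns off sed's default printing, and each `p` command then prints only the lines (or line ranges) that match its address. Below is a minimal sketch of that technique on a made-up page; the `sample` text and the `filtered` variable are hypothetical and stand in for the real AMP markup, which is more complex.

```shell
# Hypothetical sample input standing in for an AMP page (not the real WSJ markup).
sample='<html>
<title>
Example headline
</title>
<script>tracking();</script>
<p>First paragraph.</p>
<div class="ad">Ad markup</div>
<p>Second paragraph.</p>
</html>'

# -n suppresses default output, so only lines hit by a "p" command are printed.
# /<title>/,/<\/title>/p prints the range from the opening to the closing tag.
# /<p>/p prints every line that contains a paragraph tag.
filtered=$(printf '%s\n' "$sample" | sed -n '/<title>/,/<\/title>/p;/<p>/p')

printf '%s\n' "$filtered"
```

This keeps the three title lines and the two paragraphs and drops the script and ad lines. One caveat when adapting the pattern: sed looks for the end of a range only on lines after the one that opened it, so if the opening and closing tags share a line, the range keeps running until a later line matches the closing pattern (or until the end of the input).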