Does anyone know how Evernote stores a full webpage in a note? For me, storing a full page is key for bookmark management, but I also want to leave Evernote. Formats such as MHTML or webarchive are browser or even OS dependent. Is there a script-based tool that does this, so I could run it in the background?
Peter Rukavina mailed me:
I saved this post to Evernote using three different methods:
1. Dragged-and-dropped the URL from the address bar of Firefox onto the Evernote icon in my dock.
2. Installed the Evernote Web Clipper add-on for Firefox, and saved the post as an “Article”.
3. Used the same Firefox add-on to save the post as “Full Page”.
With three notes in Evernote representing this post, I exported the notes to a single ENEX file (https://evernote.com/blog/how-evernotes-xml-export-format-works/), Evernote’s open XML format for exporting and importing notes.
Looking at the XML file, I see that each of the three versions includes a “content” element that stores the HTML of the post as ENML, Evernote’s markup language that’s an extension of XHTML. Elements of the page that would otherwise be pointers to external files — PNGs, JPEGs, etc. — are stored as base64-encoded (https://en.wikipedia.org/wiki/Base64) “resource” elements in the ENEX file.
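To make that structure concrete, here is a minimal Python sketch that reads an ENEX export and decodes the embedded resources. It assumes the usual layout of `note` elements containing `title`, `content` and `resource` children, with each resource holding a `data` and a `mime` element. The function name `read_enex` and the tiny `SAMPLE` document are made up for illustration; a real export has more fields than this.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed ENEX document for illustration only.
SAMPLE = """<en-export>
  <note>
    <title>Example</title>
    <content><![CDATA[<en-note><p>Hello</p></en-note>]]></content>
    <resource>
      <data encoding="base64">aGVsbG8=</data>
      <mime>image/png</mime>
    </resource>
  </note>
</en-export>"""


def read_enex(enex_text):
    """Return one dict per note: title, ENML content, decoded resources."""
    root = ET.fromstring(enex_text)
    notes = []
    for note in root.iter("note"):
        resources = []
        for res in note.iter("resource"):
            # The binary file is stored as base64 text in the "data" element.
            data = res.findtext("data", default="")
            resources.append({
                "mime": res.findtext("mime", default=""),
                "bytes": base64.b64decode(data),
            })
        notes.append({
            "title": note.findtext("title", default=""),
            "enml": note.findtext("content", default=""),
            "resources": resources,
        })
    return notes


if __name__ == "__main__":
    for n in read_enex(SAMPLE):
        print(n["title"], len(n["resources"]), "resource(s)")
```

For a real export you would read the file from disk and pass its text to `read_enex`. Large exports may be better served by a streaming parser, which this sketch does not attempt.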
If your goal is to find a replacement for Evernote for making new archives of web pages, then you’ll need a new solution, like wget or httrack.
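As a rough idea of what a script-based, background-friendly archive step could look like, the Python sketch below builds a wget command line that fetches a page together with its images and stylesheets and rewrites the links for offline viewing. It assumes GNU wget is installed. The helper names `wget_command` and `archive_page` and the destination directory are placeholders of mine, not part of any tool.

```python
import subprocess


def wget_command(url, dest_dir):
    """Build a wget invocation that saves one page plus what it needs to render."""
    return [
        "wget",
        "--page-requisites",    # also fetch images, CSS, scripts used by the page
        "--convert-links",      # rewrite links so the local copy works offline
        "--adjust-extension",   # save HTML with an .html suffix where needed
        "--span-hosts",         # allow requisites hosted on other domains
        "--directory-prefix", dest_dir,
        url,
    ]


def archive_page(url, dest_dir):
    """Run wget; raises CalledProcessError if wget exits with an error."""
    subprocess.run(wget_command(url, dest_dir), check=True)


if __name__ == "__main__":
    # Placeholder URL and directory, for illustration only.
    print(" ".join(wget_command("https://example.com/post", "archive")))
```

Calling `archive_page` from a cron job or a small watcher script would cover the "run it in the background" part. Pages that build their content with JavaScript will not be captured fully this way.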
However, if your goal is to export the web pages you already have in Evernote and import them into whatever solution you find to replace it, I suspect that you could export all of your notes, and then programmatically recreate the pages from the ENEX files.
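One way such a recreation step might look in Python is sketched below. It relies on ENML referring to each embedded file through an `en-media` tag whose `hash` attribute is the MD5 of the file's bytes, and it swaps image references for inline `data:` URIs so that the result is a single self-contained HTML file. The function name `enml_to_html` is my own, the regular expressions are a shortcut rather than a full ENML parser, and non-image attachments are left untouched.

```python
import base64
import hashlib
import re


def enml_to_html(enml, resources):
    """Turn ENML into plain HTML; resources is a list of (mime, bytes) pairs."""
    # Index the decoded resources by the MD5 hex digest that en-media refers to.
    by_hash = {hashlib.md5(data).hexdigest(): (mime, data)
               for mime, data in resources}

    def replace_media(match):
        tag = match.group(0)
        found = re.search(r'hash="([0-9a-f]+)"', tag)
        if not found or found.group(1) not in by_hash:
            return tag  # unknown reference: leave the tag as it was
        mime, data = by_hash[found.group(1)]
        if not mime.startswith("image/"):
            return tag  # only images are inlined in this sketch
        encoded = base64.b64encode(data).decode("ascii")
        return f'<img src="data:{mime};base64,{encoded}"/>'

    # Drop the XML declaration and DOCTYPE that ENML content usually carries.
    html = re.sub(r"<\?xml[^>]*\?>|<!DOCTYPE[^>]*>", "", enml).strip()
    html = re.sub(r"<en-media\b[^>]*>(?:\s*</en-media>)?", replace_media, html)
    # en-note is the ENML root element; treat it as the HTML body.
    html = re.sub(r"<(/?)en-note\b", r"<\1body", html)
    return "<html>" + html + "</html>"
```

Writing the returned string to an `.html` file per note gives pages that open in any browser, at the cost of larger files than linking to separate image files would produce.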
Evernote also has the ability to export notes as HTML, which might, in fact, be a better solution, as you don’t need to worry about parsing XML, and instead end up with a copy of the HTML for each page, linked to copies of the external images, etc. needed to recreate the page.