Web pages cannot be trusted to remain online as they were when you dropped by to see them. They can change while their address stays the same, an address can redirect to someplace else entirely, and entire websites can disappear, so that there’s no response at all when you try to visit an address.
This means that when I link to something here, there’s no guarantee at all that clicking that link shows you what I thought I linked to. And no, screenshots don’t help: faking them is easy, and we need to back up our words with hyperlinks.
The same is true for stuff I don’t link to here, but save to my archive. That’s why I don’t just save URLs but the entire article to my markdown notes for future reference. The actual source might still disappear though, leaving me no way of proving that what I saved is what I saw. This is relevant not only to the content itself, but also to licensing information, for instance. There are photos in this blog that were openly licensed when I used them but no longer are, making it impossible for me to prove that I can still use the image under the license it had at the time.
This makes an archiving service like Archive.org useful. I can use it to store URLs I find interesting, and I do that with some regularity. It is why I am a monthly donor to the Web Archive: I’d like it to remain a robust reference point on the web.
Currently I have one way of adding web pages to an archive: the Wayback Machine add-on in my browser. The same add-on helps me find previous versions of a page already archived, tweets about that page, and annotations made by others. Very useful while browsing.
The Web Archive browser add-on bookmarklet
Writing blogposts, saving webpages as markdown in my local notes, or starting to annotate a page in Hypothes.is is another matter, however. There I’d like to automate getting or creating an archive link.
In all cases it would need to be an archive link next to the original. If I link to something in a blogpost here I want to still send WebMentions to the linked site, and that requires the link to the original to be in my posting. Similarly for my notes, I want to have the original url as well, although it would be reconstructable from the archive link. For online social annotations in Hypothes.is, the original link is needed because that is how you find other people’s annotations alongside your own. The last one is probably easiest, by using the browser add-on manually and adding the result as a first annotation for instance.
An archive link as first Hypothes.is annotation
On the page in the IndieWeb wiki about using the Internet Archive there are some code snippets for use with the Archive’s API, and it shows the basic string to save something: https://web.archive.org/save/urlhere. It also mentions bloggers who send the URLs they mention, their own postings, or both (e.g. when sending a WebMention) to the Internet Archive.
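As a minimal sketch of that basic string in use, assuming Python and nothing beyond its standard library: the function names, the user agent string and the timeout are my own illustrative choices, and that the request ends up at the address of the new snapshot is an assumption I have not verified.

```python
import urllib.request

# The basic "save" string mentioned above; the page address is appended to it.
SAVE_PREFIX = "https://web.archive.org/save/"


def build_save_url(page_url: str) -> str:
    """Return the save address for page_url (plain concatenation, no network)."""
    return SAVE_PREFIX + page_url


def request_snapshot(page_url: str, timeout: float = 60.0) -> str:
    """Ask the Archive to save page_url and return the address we ended up at.

    Assumption: after any redirects the final address is the snapshot.
    The user agent string is hypothetical.
    """
    request = urllib.request.Request(
        build_save_url(page_url),
        headers={"User-Agent": "archive-link-sketch/0.1"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl()
```

Calling `build_save_url("https://example.com/page")` only builds the address; `request_snapshot` does the actual request and may be slow or rate limited.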
When posting to my blog from my local markdown notes I could potentially add a function to the markdown-to-html parser I use where it detects external links, runs them through the Archive and writes the html for both the direct and the archived link.
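A rough sketch of what such a function could look like, assuming Python and a post-processing step on the generated html rather than a hook inside any particular parser. The name `add_archive_links`, the regular expression and the `archive` callable (which turns an original URL into an archive URL) are all hypothetical. The pattern only handles anchors where `href` is the first attribute and is double quoted.

```python
import re
from typing import Callable
from urllib.parse import urlparse

# Matches simple anchors such as <a href="https://...">text</a>.
# Assumption: href comes first and uses double quotes.
ANCHOR = re.compile(
    r'<a\s+href="(?P<href>https?://[^"]+)"[^>]*>.*?</a>', re.DOTALL
)


def add_archive_links(
    html: str, own_host: str, archive: Callable[[str], str]
) -> str:
    """Append an archive link after every anchor that points off own_host."""

    def expand(match: re.Match) -> str:
        href = match.group("href")
        if urlparse(href).hostname == own_host:
            return match.group(0)  # internal link, leave untouched
        # Keep the original anchor so WebMentions still find it.
        return f'{match.group(0)} (<a href="{archive(href)}">archived</a>)'

    return ANCHOR.sub(expand, html)
```

The original link stays in place, which the WebMention requirement above depends on. For `archive` one could pass `lambda url: "https://web.archive.org/save/" + url`, or a function that actually requests a snapshot and returns its address.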
For saving web articles as local notes in markdown there are several options to explore:
- When saving from the browser using the Markdownload add-on, first saving the page with the Archive add-on and copying the resulting archive URL, then pasting that archive link in the dialog box.
- Adding `[Opslaan in Internet Archive](https://web.archive.org/save/{baseURI})` (Dutch for “Save to Internet Archive”) to the Markdownload template, so I can directly save a URL from within the local note later, if wanted. I added this experimentally, to see if I would actually use it like this.
- When saving to notes from my microsub feedreader, I could add a function to the html-to-markdown parser I use there to run external links through the Archive and write the Archive link in markdown after the original link.
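The last option could be sketched like this, assuming Python and a pass over the markdown after conversion. The function name and the pattern are hypothetical. The added link uses the save string from the template above, so it only triggers archiving when clicked rather than pointing at an existing snapshot.

```python
import re

SAVE_PREFIX = "https://web.archive.org/save/"

# Inline markdown links [text](url); the lookbehind skips images ![alt](url).
MD_LINK = re.compile(
    r'(?<!!)\[(?P<text>[^\]]*)\]\((?P<url>https?://[^)\s]+)\)'
)


def append_archive_links(markdown: str) -> str:
    """Write an archive link in markdown after each original link."""

    def expand(match: re.Match) -> str:
        url = match.group("url")
        if url.startswith("https://web.archive.org/"):
            return match.group(0)  # already an Archive address
        return f"{match.group(0)} ([archive]({SAVE_PREFIX}{url}))"

    return MD_LINK.sub(expand, markdown)
```

Note that running it twice over the same note would add a second archive link after each original, so it is meant to run once at save time.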
@ton I’ve had a few exchanges with IA over the “Save page now” URL and automated submission. They actively endorse its use in this way. I do suspect there’s a prospect for abuse (I’ve seen rate limits / delays in submission where I’ve submitted many manually), but in general, “other people found this link of interest” is in fact a useful archival heuristic. It’s used, for example, in deciding what YouTube content to archive (mentions to Twitter will trigger an archival). That’s discussed on the IA blog.
@dredmorbius thank you, that’s good to know!