Early this month someone at the Dutch government entity for standards asked in a session about standards for maintaining narrative data (story collections). I don’t know of any, and haven’t seen them either. Today someone else asked about the same thing. Narrative data contains a free-format element (the ‘story’) and a range of qualifiers (either set by the storyteller as sensemaking, or by the researcher as descriptors, I imagine). I can see how you might need to declare a data structure describing the qualifiers, and then add the data itself. This is perfectly doable in things like XML or JSON, but do any standards for it exist? A first search for data standards in qualitative research yields only checklists for designing qualitative research and reporting on it, which is something different.
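To make that concrete: a minimal sketch of what such a JSON structure might look like, entirely my own invention rather than any existing standard. All field names here (‘qualifiers’, ‘stories’, ‘set_by’ and so on) are hypothetical; the point is only that the qualifiers get declared once, and each story then carries values for them.

```json
{
  "qualifiers": [
    {"id": "emotion", "set_by": "storyteller", "type": "scale", "range": [1, 7]},
    {"id": "theme", "set_by": "researcher", "type": "category", "values": ["work", "family", "health"]}
  ],
  "stories": [
    {
      "story": "Free-format narrative text goes here...",
      "values": {"emotion": 5, "theme": "work"}
    }
  ]
}
```

A standard, if one existed, would presumably pin down exactly this kind of declaration: which qualifier types are allowed, and how stories reference them.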

Elicit.org surfaces a few papers, based on the title above as prompt, that don’t seem to directly apply but might mention something worthwhile.

One of the two people asking me about it also mentioned Nora Bateson’s notion of ‘warm data’, which is data that preserves the interconnectedness of aspects in a complex setting and is sensitive to patterns (I’m reminded here of Gibson’s nodal points). While that provides a label for qualitative data, and adds patterns and constellations as a data item, I don’t see how it’s immediately useful for the question at hand about standards. Bateson’s explanations aren’t helpful here, as they mostly read as a sales pitch to become a ‘warm data host’, whatever that means. This Bateson’s father is Gregory Bateson, which is an interesting link w.r.t. mental maps consisting of ‘differences that make a difference’ (a notion Nora Bateson adopts into ‘warm data’, btw). It would be quite nice to be able to describe such things consistently.

If there isn’t a standard, getting to one would be quite ambitious. Creating a standard requires bringing on board at least some variety of stakeholders whose use cases can be covered by the prospective standard; otherwise you’re just creating your own data format. Which in itself can be useful (I created one for my own booklists here), if only as a proof of concept, and at least it means the proposal is in actual use.

In reply to Creating a custom GPT to learn about my blog (and about myself) by Peter Rukavina

It’s not surprising that GPT-4 doesn’t work like a search engine and has a hard time surfacing factual statements from source texts. Like one of the commenters, I wonder what that means for the data analysis you also asked for. Perhaps those results too are merely plausible, but not actually analysed. Especially the day-of-the-week thing, as that wasn’t in the data, and I wouldn’t expect GPT to determine the weekday of every post in the process of answering your prompt.

I am interested in doing what you did, but with 25 years of notes and annotations, and with a different model that has fewer ethical issues attached. To have a chat about my interests and the links between things. Unlike the fact-based questions he asked the tool, that use doesn’t need the answers to be correct, just plausible enough to surface associations. Such associations might prompt my own thinking and my own searches working with the same material.

It also makes me wonder whether what Wolfram Alpha is doing these days gets a play in your own use of GPT+, as they are all about interpreting questions and then giving the answer directly. There’s a difference between things that face the general public, and things that are internal or even personal tools, like yours.

Have you asked it things based more on association yet? Like “based on the posts ingested what would be likely new interests for Peter to explore” e.g.? Can you use it to create new associations, help you generate new ideas in line with your writing/interests/activities shown in the posts?

So my early experiments show me that as a data analysis copilot, a custom GPT is a very helpful guide… In terms of the GPT’s ability to “understand” me from my blog, though, I stand unimpressed.

Peter Rukavina

In reply to Stop Using AI-Generated Images by Robin Rendle

This specific argument sounds way too much like the ‘every music download is a missed sale’ trope of the music industry, two to three decades ago. And that was about actual recordings, not generated ones.

There are many good reasons to not use algogens, or to opt for different models for such generation than the most popular public facing tools.

Missed income for illustrators through their use in blog posts, as Rendle posits, isn’t a likely one.
Like with music downloads there’s a whole world of potential users underneath the Coasean floor. Several billion people, I assume.

My blog or presentations will never use bought illustrations.
I started making lots of digital photos for that reason way back in 2003, and have used open Creative Commons licenses whenever I used something by another creator, ever since those licenses existed.
These days I may try to generate a few images, if it’s not too work intensive. Which it quickly is (getting a good result is hard and takes more time than a Flickr search for an image), and then paying someone who can manipulate the tools better than I can (like now, with Adobe Illustrator) might be in order, except for my first point at the start of this paragraph. Let’s also not pretend that all current illustrators do everything by hand, or have done since the 1990s. There’s tons of artwork out there that is only possible because of digital tools. Algogens aren’t such a big jump that you get to treat them as a different species.

That’s not to say that, outside the mentioned use case of blogs and other sites (the ones that are already indistinguishable from generated texts and only exist to generate ad eyeballs), the lower end of the existing market won’t get eroded, unless the illustrators at that end of the market up their game with these tools (they’re actually better positioned for it than I am, in terms of skills and capabilities).
I bet that at the same time there will be a growing market for clearly human-made artefacts as status symbols too. The Reverse Turing effect in play: the default assumption will be, must be, that anything is generated. I’ve paid more for prints of artwork, both graphics and photos, made by or at the behest of the artist, than for one printed after their death, for instance. Such images adorn the walls at home rather than my blog, though.

I always thought those AI-generated images in blog posts and elsewhere around the web were tacky and lame, but I hadn’t seriously considered the future economic impact on illustrators.

Robin Rendle

Image generated with Stable Diffusion using the prompt “A woman is sitting behind a computer making an image of a woman drawing, sketchbook style”. The result is an uncanny image. Public domain image.

It was our second week of four in Lucca in July 2015. We were there to heal. It was very hot, and we had quickly settled into a rhythm of morning coffee in one of the many tiny streets still following the original Roman street pattern, an early lunch out or quick salad at home before hiding during the hottest hours in our air conditioned apartment, and heading out again late afternoon for wine followed by dinner al fresco or walking the city walls.

One such morning, after sipping our coffees, we strolled past the square that still follows the contour of the amphitheatre that once stood there, down the Via della Fratta, and came across the Lucca center for contemporary art, Lu.C.C.A. It had a retrospective of the work of photographer Elliott Erwitt.

Lucca center for contemporary art as seen in 2015 with the Erwitt banner on the facade. The center closed indefinitely in June 2021.

Born in Paris in 1928 to Jewish parents from Russia, Erwitt spent his early childhood in Milan and emigrated to the USA in his early teens, just before the Second World War. After the war he photographed in France and Italy, and joined Magnum in 1953.

It was a surprise to find this photographer and his work inside the walls of an ancient Tuscan town.
We enjoyed his love of irony and the candid shots of the little absurdities of life. Sometimes it took a moment to realise what we were seeing. His images made us smile, in a year that generally didn’t.

We bought two poster-sized prints of Elliott Erwitt’s photos in April 2021. One is the ‘dog legs’ photo, taken in NYC in 1974. The other is a 1968 image taken in the Florida Keys. Both we had seen in the retrospective in Lucca. Since having them framed they hang above the piano in E’s home office.

Erwitt died this week at the age of 95. His work will continue to make me smile whenever I walk into E’s home office.

The catalogue of the Lucca Erwitt retrospective in 2015, that I pulled from the book shelves to leaf through today.

I have lots of images in my Flickr account, over 35k uploaded from April 2005 until now. Searching them by date isn’t made easy: you first have to do a search and then set a date range in the advanced filters. The resulting search URL however is straightforward: it contains my user ID, and both the start and end dates to search, in unix time.

To make it easy for myself I made a tiny tool: a locally hosted web form where I enter a start and end date, which it turns into the right search URL, shown as a clickable hyperlink. Meaning that with two clicks I have the images I searched for by date in front of me (if they exist).

The small piece of code is shown below (save it as a .php file, insert your Flickr user ID, run it on a local webserver on your laptop, and add its URL to the bookmark bar in your browser).

<?php
/* form to search Flickr by date range */
/* insert your own Flickr user ID in $base */
$base = "https://www.flickr.com/search/?user_id=YOUR_USER_ID&min_taken_date=";
if ($_POST) {
    $dvan = $_POST['van'];    /* start date, yyyy-mm-dd */
    $dtot = $_POST['tot'];    /* end date, yyyy-mm-dd */
    $uvan = strtotime($dvan); /* convert both to unix time */
    $utot = strtotime($dtot);
    $searchurl = $base.$uvan."&max_taken_date=".$utot."&view_all=1";
    echo "<a href=\"".$searchurl."\">".$searchurl."</a>";
}
?>
<!-- the search form itself -->
<form name="input_form" method="POST" action="<?php echo $_SERVER['PHP_SELF'] ?>">
Search Flickr (date format yyyy-mm-dd)<br/>
From <input type='text' size='20' name='van'><br/>
To <input type='text' size='20' name='tot'><br/><br/>
<input type="submit" name="submitButton" value="Zoek">
</form>

I realised I had an ical file of all my appointments from the period I used Google Calendar: from January 2008, when I started as an independent consultant, until February 2020, when I switched to the calendar in my company’s NextCloud.
I never search through that file, even though I sometimes wonder what I did at a certain moment in that period. After a nudge by Martijn Aslander, who wrote on a community platform we both frequent about backfilling his daily activities into Obsidian, for instance based on his photos of a day through the years in his archive, I thought to import that ical file and turn it into day logs listing my appointments for a date.

I tried to find some ready-made parsers or libraries I could use in PHP, but most of what I could find is aimed at importing a live calendar rather than an archived file, and none of them were aimed at outputting that file in a different format. Looking at the ical file I realised that making my own simple parser should be easy enough.

I wrote a small PHP script that reads the ical file line by line until it finds one that says BEGIN:VEVENT. Then it reads on until it finds the closing line END:VEVENT. It then interprets the information between those lines, lifting out the date, location, name and description, while ignoring everything else.
For each event it finds, it writes to a file ‘Daylog [date].md’ in a folder ./year/month (creating the file, or appending the event as a new line if the file exists). It uses the format I use for my current day logs.
This repeats until it has processed all 4,714 events in my calendar from 2008 to 2020.
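The steps above can be sketched roughly like this. This is a minimal reconstruction, not the actual script: the function name, file paths and output format are my own assumptions, and it skips real-world ical details like long-line unfolding and timezone handling.

```php
<?php
/* Sketch: read an ical archive, write one 'Daylog yyyy-mm-dd.md' per day
   in folders $outdir/year/month, appending when a day's file already exists.
   Returns the number of events written. */
function ics_to_daylogs(string $icsfile, string $outdir): int {
    $count = 0;
    $event = null; /* null = not inside a VEVENT block */
    foreach (file($icsfile, FILE_IGNORE_NEW_LINES) as $line) {
        if (strpos($line, 'BEGIN:VEVENT') === 0) {
            $event = []; /* start collecting fields for a new event */
        } elseif (strpos($line, 'END:VEVENT') === 0 && $event !== null) {
            /* lift the date out of e.g. DTSTART:20080220T100000Z */
            if (preg_match('/(\d{4})(\d{2})(\d{2})/', $event['DTSTART'] ?? '', $m)) {
                $dir = "$outdir/{$m[1]}/{$m[2]}";
                if (!is_dir($dir)) { mkdir($dir, 0777, true); }
                $entry = "* " . ($event['SUMMARY'] ?? '')
                       . (isset($event['LOCATION']) ? " ({$event['LOCATION']})" : "")
                       . "\n";
                /* create the day's file, or append to it */
                file_put_contents("$dir/Daylog {$m[1]}-{$m[2]}-{$m[3]}.md",
                                  $entry, FILE_APPEND);
                $count++;
            }
            $event = null;
        } elseif ($event !== null
                  && preg_match('/^(DTSTART|SUMMARY|LOCATION|DESCRIPTION)[^:]*:(.*)$/',
                                $line, $m)) {
            $event[$m[1]] = $m[2]; /* keep only the fields we need */
        }
    }
    return $count;
}
```

Calling something like `ics_to_daylogs('calendar.ics', './daylogs')` would then walk the whole archive in one go and report how many events it wrote.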

A screenshot of all the folders with Daylogs created from 2008-2020

Screenshot of the newly created Daylog for 20 February 2008, based on the appointments I had that day, taken from the calendar archive. This one mentions a preparatory meeting for the open GovCamp I helped organise that year in June, which kicked off my work in open data.