Write Only Log for August 2025 - osric.uk

Backup script is nearly working, except it's changing the permissions/ownership of db files. Which breaks the blog. Which has cost me at least one entry. Poot.

If I want to create a container for running automated tests, I need to give it databases in a known state. I can use dotnet ef to create empty databases (TODO: Look up how to seed a database without creating a migration, or at least make it optional. I can use SQL scripts to populate most of my databases, but creating a password for a test user is a bugger).

Alright, it's the password that's the main blocker. Options:

  • create a password using the ui (once) and copy it out of the db. Fiddly, but only once.
  • Use the spectre cli stuff to add a create password/user mode to the app.

That second option I like a lot. Let's do that.
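Something like this is what I'm imagining for the known-state setup, assuming SQLite and making up the db name, seed script path, and the create-user verb:

    dotnet ef database update --connection "Data Source=test.db"     # empty schema from the migrations
    sqlite3 test.db < seed/test-data.sql                             # bulk rows from plain SQL scripts
    dotnet run -- create-user test@example.com --password swordfish  # the new Spectre-powered mode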

Hello world, do you like the new look? I think I'm a fan.

I've stripped out almost all of the css, only putting back enough to try to pin the nav menu to the top of the page. I'm only using system colours and default fonts, and I added a few lines to the blog archive so that the month tables wrap.

(Oh, yeah. I replaced the css grid month tables with, y'know, actual tables. Seriously, css grid is fun, but calendars are close to ideal table data.)

This blog entry box needs a "max-width: page-width", but I'll deal with that later. I've just come back from town and gosh, that place is hard to cope with!

On the plus side, I came back with new XL trousers, a significant improvement over the 5X trousers I bought last time. I tried L, but they were a little tight. Maybe in a month or two.

Think I've got the fixed menu working now. Thanks to a stack overflow answer, I'm using box-shadow to draw outside the element's box, so I can keep it 8px from the top (the standard body margin) and still make those 8px opaque.

I wonder when CSS is going to get references, so I can do background-color: ref(body.background-color); (yes, I could do something similar with variables, but I don't want to have to specify everything; I want to use whatever the user (browser) thinks is the default).

Ah, well.

In tonight's episode of "Fun with pipelines", we've got this monstrosity:

awk '{ print $3 }' logs/* |  
	sort -n |
	uniq |
	tail +6 |
	awk '{print "-x " $1 }' |
	xargs dig +noall +answer |
	rev |
	awk -F. '{print $2 "." $3}' |
	rev |
	sort |
	uniq -c |
	sort -nr |
	head -n 10

which produces an output something like:

    349 amazonbot.amazon
    186 ahrefs.net
     67 yandex.com
     47 msn.com
     39 semrush.com
     16 petalsearch.com
     14 babbar.eu
     10 googleusercontent.com
     10 google.com

telling us that amazon used nearly 350 different source IPs to make requests to this server.

Explanation

It's bash, so the | at the ends of the lines means "pass the output of this command to the next command".

awk '{ print $3 }' logs/*

Awk is a small text processing tool. By default, it splits its input into lines (using \n) and then into fields (using spaces and tabs). In this case, print $3 means "print the 3rd (counting from 1) field". We've told awk to read all the files in the logs folder, which have an IP address in the 3rd field.
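For example, given a made-up log line where the address happens to sit in the third field:

    echo 'osric.uk - 192.0.2.10 [01/Aug/2025:10:15:00 +0000] "GET / HTTP/1.1" 200 512' | awk '{ print $3 }'
    # prints: 192.0.2.10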

sort -n | uniq

sort takes its input, splits it into lines, sorts it (numerically, thanks to -n), and prints the result. uniq then drops duplicate lines, but only when they're adjacent, which is why we sort first; the deduplication is what we really want, the sorting is mostly a means to that end (sort -u would do both jobs in one go). The tail +6 after that just skips the first five addresses in the sorted list.
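A tiny demonstration with made-up addresses:

    printf '192.0.2.7\n192.0.2.5\n192.0.2.7\n' | sort -n | uniq
    # prints 192.0.2.5 and 192.0.2.7, once each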

awk '{print "-x " $1 }'

Awk again, this time printing each input line with "-x " in front, which helps us build the next command.

xargs dig +noall +answer

xargs is a neat toy. It takes the input fed to it and the command you give it, then runs that command with all the input tacked on as arguments.

It's very useful for, well, this kind of situation, where we've got a list of things that we want to run a command with.

The command we've asked it to run is dig, a "DNS lookup utility". Given a hostname, it will find the IP, or (as in our case) given an IP address, it will find the hostname. We have to give it the -x option per address to tell it we're looking up the name, which is why the earlier awk stuck "-x " in front of each one. The +noall +answer options limit its output to the part we care about (the answer!).
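In effect, xargs builds one big dig command. Typed by hand (documentation-range addresses, invented hostname) it would look something like:

    dig +noall +answer -x 192.0.2.10 -x 192.0.2.99
    # each answer line comes back roughly like:
    #   10.2.0.192.in-addr.arpa.  3600  IN  PTR  crawler-10.example.net.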

rev

rev outputs each line of input reversed.
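To see what the rev / awk -F. / rev sandwich further down is doing, try it on a single name. dig prints hostnames with a trailing dot, which is why fields 2 and 3 of the reversed line are the last two labels:

    echo 'crawl-1.amazonbot.amazon.' | rev | awk -F. '{print $2 "." $3}' | rev
    # prints: amazonbot.amazon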

(Sorry, I ran out of energy at this point. Hopefully I'll come back and finish at some point)

This input box is still too wide, but not by much.

The sticky header isn't on the bookmarks index, although it has picked up the color scheme. I need to pull in the bookmarks database from live to do some proper testing.

Some blog entries have got code blocks wider than the screen, but the browser zooms out to fit the page to the screen anyway. This breaks the sticky header, because it sticks to the page rather than to the zoomed viewport the browser is actually showing.

I should go through and make sure all pages have a sensible title. (This new blog page doesn't, for example).

Dammit, I'm all tired and hot and grumpy. I haven't even worn my new trousers yet. Going into town knackered me, then I sunk a bunch of energy into the post above about the stupid command line, and it's not like my expected audience actually cares! (Hello babe, I still love you!)

::blows raspberry::

Right, it's the same problem on the bookmarks page: something is very wide, which makes the browser draw the viewport and the page at different sizes. Chaos ensues.

Different fixes, probably. I'll tell bookmarks to wrap, and I'll tell code blocks to scroll. Tomorrow. I should try sleeping now.

Idea: Give each bookmark a link to fetch.

Also, think of a way of adding a "view source" link to the bottom of all pages, without opening the fetcher up to randoms. Probably something like only allowing relative links for not-logged-in connections.

Quietish day, I installed elastic agent on a bunch of machines and then was bug hunting for the rest of the day.

After work (and a sleep, and today's episode of Only Connect) I added some noise into the URL of one of the sites I maintain to stop scrapers from finding it. I should probably just turn on auth, but the site has exactly three users, and it's far less effort to give them a slightly tweaked url, and since the site isn't linked from anywhere else, it shouldn't get found.

(The site does have a TLS certificate from let's encrypt, so the existence of the site is public information, but so long as the root returns a 404, nobody should stumble across the actual content. Unless Facebook strips urls from messenger, for example).

I freely admit this is stupid, but this year's pay offer is out, and it's fine, no worries, except my pay is going to be £1 short of £50k!

(I said it's stupid, yeah?)

Today's pro tip: Do not uninstall the wireless network package that enables the wireless network you are using to reach the machine.

Now I've got "reinstall OS" on the todo list. (For two raspberry pis that I was being all efficient and setting up with ansible.)

Sigh.

I'm getting k3s fettled up again. I've got a VM (tayet) as the server, and a couple of fallow Raspberry Pis as agent nodes. Still not sure what I want to do with a k3s cluster. There's a bunch of stuff to install, like step-ca, that one should have running in a cluster, but I'm working on folding my website down to a monolith, so I don't have that many actual applications.

On the other hand, there is a reasonable chance that we're going to get fast symmetric fibre in a couple of months (when the BT contract ends), so I'm tempted to bring most of my hosting back home, on principle. Of course, I still need a cloud endpoint for the reputation, and splitting services between cloud and house makes them both slower and less reliable.

Sigh, again. 20 year old me would have gone crazy at the idea of symmetric gigabit fibre to the home, but I think husband may be right and we just don't need that kind of bandwidth.

Just for information, what I've got:

  • My website (the one you should be reading this blog on), which also has the food diary, an unfinished bookmark sync toy, and a couple of other projects
  • Webmail (the main excuse for a cloud server; includes an SMTP server, an IMAP server, a few anti-spam servers, as well as the website. Oh, and a bunch of storage for emails)
  • Husband's website
  • Private file share, for files too big to reliably email (that we're renting extra storage for)
  • Private container repo
  • Private Nuget repo (that I'm not using any more; I gave up and pulled the packages (and the microservices that used them) into a monorepo with the webserver)
  • Oauth2.0 server

Want to add:

  • Metrics/Logs/Traces/Alerts (Prometheus/Grafana used to be my go-to choice, but I've soured on them and I don't know why. I'm still looking for something tiny and reliable. I miss MRTG)

Dealing with the urge to eat a whole bunch of hifi bars. I very much want to, except I also want to get this stupid diet finished (and yes, I do know that the diet never actually finishes. If you're going to be like that, I also know that eating junk doesn't actually help me feel better).

Not feeling the urge/capacity to write much at the moment. I've done a bunch of tech stuff to the website, depression/lethargy continue to add suck to my life, my body weight is still dropping, life goes on.

New idea: HTTarPit

Listen on a socket, accept connections, wrap the socket in TLS and send the right certificate (use a library for this bit). Use poll to manage the sockets; no need to keep per-connection state. While a socket is open, read and drop incoming bytes, and write an outgoing space every 20 seconds.

Problem is, looking at the spec, that won't work. An HTTP response has to start "HTTP-version SP status-code SP" (RFC 9112, Section 4), so we'll need to keep at least enough state to send that first, although a 13-byte buffer isn't that much.
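Quick sanity check on that 13 bytes:

    printf 'HTTP/1.1 503 ' | wc -c
    # 13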

Not sure what status code to use, but making it configurable solves that problem in the short term.

Sounds like a fun C programming exercise, a fairly classic use of poll that should be able to run under fairly restricted resources and might end up slightly annoying a bot operator.
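Before any C gets written, a crude single-connection approximation can be lashed up with netcat, just to watch how clients react (OpenBSD nc syntax; traditional netcat wants -l -p 8080, and the 503 is only a placeholder status):

    # send the start of a status line, then a space every 20 seconds,
    # and throw away whatever the client sends
    while true; do
        { printf 'HTTP/1.1 503 '; while sleep 20; do printf ' '; done; } |
            nc -l 8080 > /dev/null
    done

(It can take up to 20 seconds to notice a disconnect and loop round for the next victim, but it's a sketch, not the product.)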

MVP: Get it working.

Post MVP: Record (and report!) stats. Total connections, bytes received/sent, time connections held open (min/max/average, maybe a histogram once it's clear what size buckets to use). If I wanted to put the time into parsing the HTTP request I could gather stats from headers (UserAgent mostly), and maybe even respond properly to specific URLs (/robots.txt, and maybe something human readable for /), but that significantly increases the complexity and attack surface. But it would be fun.

I hope you all like the new aesthetic for the blog. Minimal, yes? But still a bit expressive.

Something like that, anyway.

Another tooth crumbled, it's the one that was/is half-way through a root canal (started a few months ago, scheduled to be completed end of September). The dentist has slapped in another temp filling, but still, ouch, at least a little bit.

Oh, in technical news, all* pages from this site should be minified! That's the (generated) HTML and any inline CSS/JS.

*Pages that the minifier can't parse don't get squished

I've written a tag helper that runs the content of any <html> tags through NUglify. (I was already using NUglify to minify JavaScript files; any .min.js file gets squashed (and cached) on the way out).

It was a little bit of a journey. I wasn't happy with the previous look of the site, so I removed all the CSS. That didn't last very long, but instead of adding <link rel=stylesheet> tags back, I added <style> tags inline in the <head>, on the grounds that I don't want a lot of CSS, even if I need some, and keeping it inline is faster/more efficient than another round trip.

(All the "speed up your site" posts say one should include enough CSS to render the page inline)

However, even a small amount of CSS leads to much whitespace, and I was feeling twitchy. I wrote a quick MinifyCssTagHelper that looked for <style> tags to minify, and that worked surprisingly well, surprisingly quickly. Given how well it worked, it was an obvious next step to try HTML minification, and yeah, that worked too.

It doesn't save much (traditionally, turning on response compression gives far more savings than minification), but it saves something, and it's really cool.
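A rough way to see both effects from the outside (numbers will vary, and the second one only means anything if the server actually compresses):

    curl -s https://osric.uk/ | wc -c                              # minified HTML, uncompressed
    curl -s -H 'Accept-Encoding: gzip' https://osric.uk/ | wc -c   # size on the wire when gzip is offered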

Hello world!

That's been a good long weekend of coding (and that's not counting tomorrow's bank holiday!).

Mostly it's been little bits here and there, but there have been lots of them, and I'm much happier with the look of the site now.

  • I've added fieldset tags around most things (those are the boxes with titles)
  • I've tidied up/added sub menus to the various areas around the site, and I've got a working version of how to automate them (except maybe blog, which is complicated)
  • I've got a working demo of a drag to reorder module (need to think about no-script versions, maybe)
  • Calendars (on the front page and the blog archive) all now have Monday as the first day of the week! (At friggin' last!)
  • Generated pages (so, like, most of them) are minified on the way out (html as well as inline css and JavaScript) (I should make sure that all the js links are .min.js links)
  • Drawing server side graphs! (Don't think any of those are on the public side yet, but I've also ...)
  • Started collecting http request logs into a database (...so I should be able to sort out an activity graph sooner or later)
  • Added the skeleton of per user settings (eg, I've put my target weight in and now my weight history gives me an estimated time to target) (33 weeks as of Thursday's weigh in)

Thinking about authorisation ("we know who you are, we're trying to decide if you can do the thing you want to do").

We've got resources to protect: routes that should only work for, and parts of pages that should only display for, authorised people.

We've got a list of users (two people is a list!). We can assign arbitrary properties to people including 'roles'. (Roles are magic strings)

I know I should think in terms of "0, 1, lots", but I'm pretty confident that it really is only ever going to be me and husband here, so having three sets of roles ("policies"), one for me, one for them, and one for everyone else, shouldn't be a real problem.

(I guess I can always just stick it in the database anyway).

New service idea: a POST endpoint that accepts and stores any data, but the path is a token (either a guid or a random word string) that's got various validity checks (can only be used n times, max (and min?) file size, expected content type, source ip, that kind of thing). The file gets saved and only the logged in user who created the token can access it.

The specific use case is to give Google somewhere to post the data from husband's food diary spreadsheet export, but it might be useful to have laying about.
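Usage would look something like this, with the token path, content type, and filename all made up:

    # the spreadsheet export (or curl, while testing) POSTs to the one-off token URL
    curl -X POST 'https://osric.uk/drop/3b9aca00-8f2e-4d1a-9c67-1f2e3d4c5b6a' \
         -H 'Content-Type: text/csv' \
         --data-binary @food-diary.csv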

The post entry button is in the wrong place (should be much closer to the input form) and page width is broken again.

Whoop! Achievement unlocked: More than a month's salary in savings!
