antipaucity

fighting the lack of good ideas

fallocate vs dd for swap file creation

I recently ran across this helpful Digital Ocean community answer about creating a swap file at droplet creation time.

So I decided to test how long my old method (using dd) takes to run versus using fallocate.

Here’s how long it takes to run fallocate on a fresh 40GB droplet:

root@ubuntu:/# rm swapfile && time fallocate -l 1G /swapfile
real	0m0.003s
user	0m0.000s
sys	0m0.000s

root@ubuntu:/# rm swapfile && time fallocate -l 2G /swapfile
real	0m0.004s
user	0m0.000s
sys	0m0.000s

root@ubuntu:/# rm swapfile && time fallocate -l 4G /swapfile
real	0m0.006s
user	0m0.000s
sys	0m0.004s

root@ubuntu:/# rm swapfile && time fallocate -l 8G /swapfile
real	0m0.007s
user	0m0.000s
sys	0m0.004s

root@ubuntu:/# rm swapfile && time fallocate -l 16G /swapfile
real	0m0.012s
user	0m0.000s
sys	0m0.008s

root@ubuntu:/# rm swapfile && time fallocate -l 32G /swapfile
real	0m0.029s
user	0m0.000s
sys	0m0.020s

Interestingly, the relationship of size to time is non-linear when running fallocate.

Compare that to building a 4GB swap file with dd (on the same server; it turned out that either a 16KB or a 4KB bs gives the fastest run time):

time dd if=/dev/zero of=/swapfile bs=16384 count=262144 

262144+0 records in
262144+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 4.52602 s, 949 MB/s

real	0m4.528s
user	0m0.048s
sys	0m4.072s

Yes, you read that correctly – using dd with an “optimum” bs of 16KB (after testing many different bs settings) takes ~1000x as long as using fallocate to create the same “size” file!

How is fallocate so much faster? The details are in the man pages for it (emphasis added):

fallocate is used to manipulate the allocated disk space for a file, either to deallocate or preallocate it. For filesystems which support the fallocate system call, preallocation is done quickly by allocating blocks and marking them as uninitialized, requiring no IO to the data blocks. This is much faster than creating a file by filling it with zeroes.

dd will “always” work. fallocate will work almost all of the time … but if you happen to be using a filesystem which doesn’t support it, you need to know how to use dd.

But: if your filesystem supports fallocate (and it probably does), it is orders of magnitude more efficient to use it for file creation.
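For reference, here is roughly how the two approaches fit together in practice – a hedged sketch, not a definitive recipe: try fallocate first, fall back to dd if the filesystem refuses, then finish the usual swap setup (the 4G size and /swapfile path are just examples):

# try fallocate first; fall back to dd if the filesystem doesn't support it
fallocate -l 4G /swapfile || dd if=/dev/zero of=/swapfile bs=16384 count=262144
chmod 600 /swapfile   # swap files should not be world-readable
mkswap /swapfile      # format the file as swap space
swapon /swapfile      # enable it (add to /etc/fstab to persist across reboots)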

simple ip address check – ipv4.cf

I’ve published another super-simple tool.

A la whatismyip.com, but with no extra cruft (and no odd formatting of the IP address under the hood): welcome IPv4.cf to the world with me!

wonder how many zombie film/tv/game creators are/were computer science nerds

As you all know, I am a huge zombie fan.

And, as you probably know, I was a CIS/CS major/minor at Elon.

A concept I was introduced to at both Shodor and Elon was ant colony simulations.

And I realized today that many people have been introduced to the basic concepts of ant colony simulations through films like Night of the Living Dead or World War Z and shows like Z Nation or The Walking Dead.

In short, ant colony optimization simulations, a part of swarm intelligence, use the “basic rules” of ant intelligence to game-out problems of traffic patterns, crowd control, logistics planning, and all kinds of other things.

Those basic rules the ants follow more or less come down to the following:

  • pick a random direction to wander
  • continue walking straight until you hit something
    • if you hit a wall
      • turn a random number of degrees between 1 and 359
      • loop up one level
    • if you hit food
      • if you are carrying trash
        • turn a random number of degrees between 1 and 179 or 181 and 359
        • loop up two levels
      • if you are carrying food
        • drop it
        • turn 180 degrees, loop up two levels
      • if you are not carrying anything
        • pick it up
        • either turn 180 degrees and loop up two levels, or
        • loop up two levels (ie, continue walking straight)
    • if you hit trash (dead ants, etc)
      • if you are carrying trash
        • drop it
        • turn 180 degrees, loop up two levels
      • if you are carrying food
        • turn a random number of degrees between 1 and 179 or 181 and 359
        • loop up two levels
      • if you are carrying nothing
        • pick it up
        • either turn 180 degrees and loop up two levels, or
        • loop up two levels (ie, continue walking straight)
    • if you hit an ant
      • a new ant spawns in a random cell next to the two existing ants (with a 1/grid-shape probability; in a square grid, this would be a ~10% chance of spawning a new ant; in a hex grid, it would be a ~12.5% chance of a spawn), IF there is an empty cell next to either ant
      • if you are both carrying the same thing
        • start a new drop point
        • turn around 180 degrees
        • loop up two levels
      • if you are carrying different things (or not carrying anything)
        • turn a random number of degrees between 1 and 359
        • loop up two levels
    • if you have been alive “too long” (parameterizable), you die and become trash, dropping whatever you were carrying into a random grid point next to you (for example, if the grid is square and you are in position “5”, your cargo could land in positions 1-4 or 6-9)

There are more rules you can add or modify (maybe weight your choice of direction when you turn based on whether an ant has been there recently (ie, simulated pheromone trails)), but those are the basics. With randomly-distributed “stuff” (food, walls, ants, trash, etc) on a board of size B, an ant population – P – of 10% of B, a generation frequency – F – of 9% of B, an iteration count of 5x the board size, and a life span – L – of 10% of B, let it run and you will see piles of trash, food, etc accumulate on the board.

They may accumulate differently on each run due to the random nature of the inputs, but they’ll accumulate – showing how large numbers of relatively unintelligent things can do things that look intelligent to an outside observer.
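If you want to play with the idea, here is a minimal bash sketch of that clustering behavior. It implements only a stripped-down subset of the rules above (wandering ants that pick up isolated food and drop it next to other food – no walls, trash, spawning, or dying), and every parameter is made up, but it shows the same piles-emerging-from-randomness effect:

#!/usr/bin/env bash
# Toy ant-clustering sketch - NOT the full rule set above, just the
# pick-up-isolated / drop-near-others behavior on a wrap-around square grid.
SIZE=20      # board is SIZE x SIZE
ANTS=40      # ant population
FOOD=80      # randomly-scattered food
STEPS=2000   # iterations

declare -A food               # "x,y" -> 1 where food is present
antx=(); anty=(); cargo=()    # per-ant position and whether it carries food

for ((i=0; i<FOOD; i++)); do food[$((RANDOM%SIZE)),$((RANDOM%SIZE))]=1; done
for ((a=0; a<ANTS; a++)); do antx[a]=$((RANDOM%SIZE)); anty[a]=$((RANDOM%SIZE)); cargo[a]=0; done

count_neighbor_food() {       # sets NEAR = food count in the 8 cells around ($1,$2)
  local x=$1 y=$2 dx dy; NEAR=0
  for dx in -1 0 1; do for dy in -1 0 1; do
    [ "$dx" -eq 0 ] && [ "$dy" -eq 0 ] && continue
    [ "${food[$(( (x+dx+SIZE)%SIZE )),$(( (y+dy+SIZE)%SIZE ))]:-0}" -eq 1 ] && NEAR=$((NEAR+1))
  done; done
}

for ((s=0; s<STEPS; s++)); do
  for ((a=0; a<ANTS; a++)); do
    # wander: step one cell in a random direction, wrapping at the edges
    antx[a]=$(( (antx[a] + RANDOM%3 - 1 + SIZE) % SIZE ))
    anty[a]=$(( (anty[a] + RANDOM%3 - 1 + SIZE) % SIZE ))
    key="${antx[a]},${anty[a]}"
    count_neighbor_food "${antx[a]}" "${anty[a]}"
    if [ "${cargo[a]}" -eq 0 ] && [ "${food[$key]:-0}" -eq 1 ] && [ "$NEAR" -eq 0 ]; then
      cargo[a]=1; unset "food[$key]"     # pick up isolated food
    elif [ "${cargo[a]}" -eq 1 ] && [ "${food[$key]:-0}" -eq 0 ] && [ "$NEAR" -gt 0 ]; then
      cargo[a]=0; food[$key]=1           # drop it next to other food
    fi
  done
done

for ((y=0; y<SIZE; y++)); do             # print the board: '*' = food, '.' = empty
  row=""; for ((x=0; x<SIZE; x++)); do [ "${food[$x,$y]:-0}" -eq 1 ] && row+="*" || row+="."; done
  echo "$row"
done

Run it a couple of times and the asterisks will clump differently each run, but they will clump.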

And that’s how zombies appear in most pop culture depictions: they wander more-or-less aimlessly until attracted by something (sound, a food source (aka the living), fire, etc). And while they seem to exhibit mild group/swarm intelligence, it’s just that – an appearance to outside observers.

So if you like zombie stories, you might like computer science.

pi-hole revisited

Back in November, I was really up on Pi-hole.

But after several more months of running it … I am far less psyched than I had been. I’m sure part of that is having gotten better internet services at my house – so the impact of ads is less noticeable.

But a major part of it is that Pi-hole is just too aggressive. Far far too aggressive. Aggressive to the point that my whitelist was growing sometimes minute-by-minute just to get some websites to work.

Is that a problem with the site? No doubt somewhat. But it’s also a problem of blacklists. When domains and IPs are just blanket refused (and not in a helpful way), you get a broken experience.

Pi-hole has also gone to a quasi-hijack approach: when a domain has been blocked, instead of it just silently not working, it now returns a message to contact your Pi-hole admin to update the block lists.

I hate intrusive ads as much as the next person … but that shouldn’t mean that all ads are blocked. I have unobtrusive ads on a couple of my domains (this one included).

But even with Pi-hole, not all ads are blocked.

Part of that is due to the ever-changing landscape of ad servers. Part of it is due to the inherent problems with the blacklist/whitelist approach.

Content creators should be entitled to compensation for their efforts (even if they voluntarily choose to give that content away). Bombarding visitors with metric buttloads of advertising, however, makes you look desperate, uncaring, or greedy.

The current flipside to that, though, is the pay-wall / subscription approach. Surely subscriptions are appropriate for some things – but I’m not going to pay $1/mo (or more) to every site that wants me to sign up to see one thing: just today, that would’ve encumbered me with over $100/mo in new recurring bills.

Maybe there needs to be a per-hour, per-article, per-something option – a penny for an hour, for example (which, ftr, comes out to a monthly fee of about $7) – so that viewers can toss some scrilla towards the creators, but aren’t permanently encumbered by subscriptions they’ll soon forget about (though, of course, that recurring subscription revenue would surely look enticing to publishers).

As with the per-song/episode purchase model that iTunes first made big about 15 years ago, you could quickly find out what viewers were most interested in, and focus your efforts there. (Or, continue focusing your efforts elsewhere, understanding that less-popular content will not garner as much revenue as popular content will).

Imagine, using my example of $0.01/hr, how much more engagement you could end up garnering while visitors are actively on your site! A penny is “nothing” to most people – and probably just about all who’re online. Maybe you’ll have a handful of people “abusing” the system by opening a thousand pages in new tabs in their hour … but most folks’ll drop the virtual coin in the nickelodeon, watch the video / read the page / whathaveyounot, and move on about their day.

And not everyone will opt for the charge model. Sites that do utilize it can have some things marked “free” or “free for the next 24 hours” or “free in 7 days” or whatever.

Ad companies like Google could still work as the middleman on handling transactions, too – any time you visit per-X content, there could be a small pop-up that indicated you’d be withdrawing Y amount from your balance to view the site (I’m sure there’ll be competition in the space, so PayPal, Facebook, Stripe, Square, etc etc can get in on the “balance management” piece). And at the end of whatever period (day, week, month), Google can do a mass-settle of all the micropayments collected for each site from each visitor (with some percentage off the top, of course).

No ads. You’d actually Get What You Pay For™, and issues like the recent Admiral thing would go in a corner and die.

i wrote a thing – paragraph, a simple plugin for wordpress

Along with becoming more active on Mastodon, I’ve been thinking more about concision recently.

One of the big selling points for Mastodon is that the character limit per post is 500 instead of Twitter’s 140.

And I was thinking, “what if there was a way to force you to write better by writing less / more compactly / more concisely?”

So after a couple weeks, I sat down and wrote an incredibly simple WordPress plugin. Introducing Paragraph.

Paragraph removes all formatting of a post or page, effectively turning it into a wall of text.

How does this help you?

If you see your writing as an uninterrupted wall of text – or a “paragraph” – you may notice that what you’re trying to say is getting lost in the noise.

It could also help force you to write more often but shorter each time.

Or maybe you’ll find it completely useless: and that’s OK, too.

update: keeping your let’s encrypt certs up-to-date

Last year I posted a simple script for keeping your Let’s Encrypt SSL certificates current.

In conjunction with my last post sharing the “best” SSL configs you can use with Apache on CentOS, here is the current state of the cron’d renewal script I use.

systemctl stop httpd.service
systemctl stop postfix
~/letsencrypt/letsencrypt-auto -t -n --agree-tos --keep --expand --standalone certonly --rsa-key-size 4096 -m user@domain.tld -d domain.tld
# you can append more [sub]domains to a single cert with additional `-d` directives ([-d otherdomain.tld [-d sub.domain.tld...]])
#...repeat for every domain / domain group
systemctl start httpd.service
systemctl start postfix

I have this script running @weekly in cron. You should be able to get away with doing it only every month or two … but I like to err on the side of caution.
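For reference, the crontab entry looks something like this (the script path and log location are just examples – put the script wherever you keep yours):

# in root's crontab (crontab -e)
@weekly /usr/local/sbin/renew-letsencrypt.sh >> /var/log/le-renew.log 2>&1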

I’m stopping and starting Postfix in addition to httpd (Apache on my system) for two reasons: first, I am using some of the LE-issued certs in conjunction with my Postfix install; second, I don’t know whether Dovecot and my webmail system need Postfix restarted when the underlying certs change.
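If you are in a similar situation, a quick way to double-check which certificate files your mail stack is actually configured to use (assuming Postfix and Dovecot – adjust for your own setup):

postconf -n | grep -i tls      # Postfix TLS settings, including smtpd_tls_cert_file
doveconf -n | grep -i ssl      # Dovecot SSL cert/key settings, if Dovecot is installed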

ssl configuration for apache 2.4 on centos 7 with let’s encrypt

In follow-up to previous posts about SSL (specifically with Let’s Encrypt), here is the set of SSL configurations I use with all my sites. These, if used correctly, should score you an “A+” with no warnings from ssllabs.com. Note: I have an improved entropy package installed (twuewand). This is adapted from the Mozilla config generator, with specific options added for individual sites and/or to match Let’s Encrypt’s recommendations.

Please note: you will need to modify the config files to represent your own domains, if you choose to use these as models.

[/etc/httpd/conf.d/defaults.conf]

#SSL options for all sites
Listen 443
SSLPassPhraseDialog  builtin
SSLSessionCache         shmcb:/var/cache/mod_ssl/scache(512000)
SSLSessionCacheTimeout  300
Mutex sysvsem default
SSLRandomSeed startup builtin
SSLRandomSeed startup file:/dev/urandom  1024
# requires twuewand to be installed
SSLRandomSeed startup exec:/bin/twuewand 64
SSLRandomSeed connect builtin
SSLRandomSeed connect file:/dev/urandom 1024
SSLCryptoDevice builtin
# the SSLSessionTickets directive should work - but on Apache 2.4.6-45, it does not
#SSLSessionTickets       off
SSLCompression          off
SSLHonorCipherOrder	on
# there may be an unusual use case for enabling TLS v1.1 or 1 - but I don't know what that would be
SSLProtocol all -SSLv2 -SSLv3 -TLSv1 -TLSv1.1
SSLCipherSuite ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256
SSLOptions +StrictRequire
SSLUseStapling on
SSLStaplingResponderTimeout 5
SSLStaplingReturnResponderErrors off
SSLStaplingCache        shmcb:/var/run/ocsp(128000)

#all unknown requests get domain.tld (over http)
<VirtualHost *:80>
    DocumentRoot /var/html
    ServerName domain.tld
    ServerAlias domain.tld *.domain.tld
    ErrorLog logs/domain-error_log
    CustomLog logs/domain-access_log combined
    ServerAdmin user@domain.tld
    <Directory "/var/html">
         Options All +Indexes +FollowSymLinks
         AllowOverride All
         Order allow,deny
         Allow from all
    </Directory>
</VirtualHost>

SetOutputFilter DEFLATE
AddOutputFilterByType DEFLATE text/html text/plain text/xml text/javascript text/css text/php

[/etc/httpd/conf.d/z-[sub-]domain-tld.conf]

<VirtualHost *:80>
    ServerName domain.tld
# could use * instead of www if you don't use subdomains for anything special/separate
    ServerAlias domain.tld www.domain.tld
    Redirect permanent / https://domain.tld/
</VirtualHost>

<VirtualHost *:443>
    SSLCertificateFile /etc/letsencrypt/live/domain.tld/cert.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/domain.tld/privkey.pem
# if you put "fullchain.pem" here, you will get an error from ssllabs
    SSLCertificateChainFile /etc/letsencrypt/live/domain.tld/chain.pem
    DocumentRoot /var/www/domain
    ServerName domain.tld
    ErrorLog logs/domain-error_log
    CustomLog logs/domain-access_log \
          "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"
    ServerAdmin user@domain.tld

# could put this in defaults.conf - I prefer it in each site config
    SSLEngine on

<Files ~ "\.(cgi|shtml|phtml|php3?)$">
    SSLOptions +StdEnvVars
</Files>
<Directory "/var/www/cgi-bin">
    SSLOptions +StdEnvVars
</Directory>

SetEnvIf User-Agent ".*MSIE.*" \
         nokeepalive ssl-unclean-shutdown \
         downgrade-1.0 force-response-1.0

    <Directory "/var/www/domain">
         Options All +Indexes +FollowSymLinks
         AllowOverride All
         Order allow,deny
         Allow from all
    </Directory>

</VirtualHost>

I use the z....conf formatting to ensure all site-specific configs are loaded after everything else. That conveniently breaks every site into its own config file, too.

The config file for a non-https site is much simpler:

<VirtualHost *:80>
    DocumentRoot /var/www/domain
    ServerName domain.tld
    ServerAlias domain.tld *.domain.tld
    ErrorLog logs/domain-error_log
    CustomLog logs/domain-access_log combined
    ServerAdmin user@domain.tld
    <Directory "/var/www/domain">
         Options All +Indexes +FollowSymLinks
         AllowOverride All
         Order allow,deny
         Allow from all
    </Directory>
</VirtualHost>

If you’re running something like Nextcloud, you may want to turn on Header always set Strict-Transport-Security "max-age=15552000; includeSubDomains" in the <VirtualHost> directive for the site. I haven’t decided yet if I should put this in every SSL-enabled site’s configs or not.
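For reference, a minimal sketch of where that line would go (it needs mod_headers loaded, and the domain is a placeholder like the rest of these examples):

<VirtualHost *:443>
    ServerName domain.tld
    # ... existing SSL directives from above ...
    Header always set Strict-Transport-Security "max-age=15552000; includeSubDomains"
</VirtualHost>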