new post: rsyncing my RSS feed database
parent
6e65c5aa90
commit
b166cdfbe5
129
_posts/2024-10-28-rsyncing-my-rss-feed-database.md
Normal file
@@ -0,0 +1,129 @@
---
permalink: "/{{ year }}/{{ month }}/{{ day }}/rsyncing-my-rss-feed-database"
title: "rsyncing my RSS feed database"
published_date: "2024-10-28 20:45:00 +0100"
layout: post.liquid
data:
  route: blog
excerpt: |
  At the moment I'm using a web-based RSS feed reader.
  That works, is reliable and accessible from anywhere.

  However, I will soon spend some time with less connectivity,
  and so I was considering my options for a local feed reader app that works offline.
  A long while ago I used Newsbeuter and then I recently found its successor Newsboat.
  That solves the local and offline part.
  But it still requires connectivity to fetch feeds,
  and for fetching some 100 feeds that's quite a bit of traffic.
---

At the moment I'm using a web-based RSS feed reader[^1].
That works, is reliable and accessible from anywhere.

However, I will soon spend some time with less connectivity,
and so I was considering my options for a local feed reader app that works offline.
A long while ago I used [Newsbeuter] and then I recently found its successor [Newsboat].

It can import feeds from an OPML file, so that's what I did.

![Newsboat showing an article on fnordig.de](https://tmp.fnordig.de/blog/2024/newsboat-fnordig.png)

That solves the local and offline part.
But it still requires connectivity to fetch feeds,
and for fetching some 100 feeds that's quite a bit of traffic.

So it would be better to fetch the feeds on a server.
That's doable, for example by sticking `newsboat -x reload` into a cronjob on the server.
But then I still need to pull down the database when I want to read new stuff.

Doable, but the database quickly grows large enough that this is prohibitive on slow and spotty connections.
Oh, and Newsboat also stores whether an article was read, and I don't want to lose that.
So I need to occasionally sync the local database back to the server,
so the unread status is not overwritten.

[rsync] could handle this.
Just push up the local database, reload feeds, sync back the updated database[^2].
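
With plain `rsync`, that round trip might look like the sketch below. The function name, the `RUN` dry-run variable and the `user@host` placeholder are made up for illustration; only `rsync -z`, `ssh` and `newsboat -x reload` are real commands. It is a sketch of the alternative, not the script shown further down.

```bash
# Hypothetical plain-rsync round trip: push, reload remotely, pull.
# Set RUN=echo to print the commands instead of running them.
plain_rsync_round_trip() {
    local host=$1
    local db=${2:-.newsboat/cache.db}   # path relative to $HOME on both sides

    ${RUN:-} rsync -z "$HOME/$db" "$host:$db"   # push the local read status
    ${RUN:-} ssh "$host" newsboat -x reload     # fetch feeds on the server
    ${RUN:-} rsync -z "$host:$db" "$HOME/$db"   # pull the updated database
}
```

Setting `RUN=echo` before calling `plain_rsync_round_trip user@host` prints the three commands without touching anything.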

That's when I stumbled upon the new [sqlite-rsync].
It's labeled as a "Database Remote-Copy Tool For SQLite"
and essentially compares the SQLite pages remotely and locally
and only transfers those that are different on the replica.
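
To get a feel for why that helps, here is a rough local sketch of the idea. It is not how `sqlite3_rsync` itself is implemented: it simply checksums two copies of a file page by page and counts the pages that differ. The function name is made up, the 4096-byte page size is SQLite's default and may not match a given database, and the real tool works over the network instead of reading both files directly.

```bash
# count_changed_pages ORIGIN REPLICA [PAGE_SIZE]
# Checksums both files one page at a time and reports how many pages differ.
count_changed_pages() {
    local origin=$1 replica=$2 page_size=${3:-4096}
    local size pages i=0 changed=0 a b

    size=$(wc -c < "$origin")
    pages=$(( (size + page_size - 1) / page_size ))   # round up to whole pages

    while [ "$i" -lt "$pages" ]; do
        # Read page number $i from each file and checksum it.
        a=$(dd if="$origin"  bs="$page_size" skip="$i" count=1 2>/dev/null | cksum)
        b=$(dd if="$replica" bs="$page_size" skip="$i" count=1 2>/dev/null | cksum)
        [ "$a" = "$b" ] || changed=$(( changed + 1 ))
        i=$(( i + 1 ))
    done

    echo "$changed of $pages pages differ"
}
```

On a large database this is slow, since it starts `dd` twice per page. It is only meant to show that a few changed articles touch a small fraction of the pages.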

To get the tool[^3]:

```
cd /tmp
git clone https://github.com/sqlite/sqlite.git
cd sqlite
./configure
make sqlite3_rsync
```

Put the `sqlite3_rsync` binary both locally and on the server in a folder in the `$PATH`.
Then save this script as `newsboat-sync`:

```bash
#!/bin/bash
# Abort on the first failure, so a failed upload is never followed by a
# download that would overwrite the local read status.
set -e

# CHANGE THIS:
HOST=user@host
LOCAL=~/.newsboat/cache.db
REMOTE=.newsboat/cache.db

echo "Sync up READ status"
sqlite3_rsync -v "$LOCAL" "$HOST:$REMOTE"

if [[ "$1" = "-u" ]]; then
    echo "READ status sync only. Exiting."
    exit 0
fi

echo "Updating feeds"
# Runs on the server: reload feeds only if the last fetch is at least 900 seconds old.
ssh "$HOST" '
    last=$(cat ~/.newsboat/lastfetch.time 2>/dev/null || echo 0)
    if [ $(( $(date +%s) - last )) -ge 900 ]; then
        echo "Updating feeds (remote)"
        newsboat -x reload
        date +%s > ~/.newsboat/lastfetch.time
    else
        echo "No update"
    fi
'

echo "Sync back new items"
sqlite3_rsync -v "$HOST:$REMOTE" "$LOCAL"
```

Now when invoked, it

1. syncs up the current state,
2. if at least 15 minutes (900 seconds) have passed since the last fetch, fetches all feeds,
3. syncs down the database to the local machine.
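
The 15-minute check in step 2 is packed fairly tightly into the script. Pulled out into a function, it could look like this sketch; `should_fetch` is a name made up here, while the timestamp file and the 900-second default are taken from the script above.

```bash
# should_fetch STAMP_FILE [MIN_AGE_SECONDS]
# Succeeds if the Unix timestamp stored in STAMP_FILE is at least
# MIN_AGE_SECONDS old, or if the file does not exist yet.
should_fetch() {
    local stamp_file=$1 min_age=${2:-900}
    local now last

    now=$(date +%s)
    last=$(cat "$stamp_file" 2>/dev/null || echo 0)   # missing file: never fetched
    [ $(( now - ${last:-0} )) -ge "$min_age" ]
}
```

On the server this would be used as `should_fetch ~/.newsboat/lastfetch.time && newsboat -x reload`, followed by writing `date +%s` into that file.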

```
$ newsboat-sync
Sync up READ status
sent 4,114 bytes, received 174,994 bytes, 320,407.87 bytes/sec
total size 34,131,968 speedup is 190.57
Updating feeds
Updating feeds (remote)
Sync back new items
sent 174,994 bytes, received 53,326 bytes, 253,971.08 bytes/sec
total size 34,131,968 speedup is 149.49
```

That's quite a speedup for a database that is 34 MB in size.

```
$ du -h ~/.newsboat/cache.db
34M /Users/jer/.newsboat/cache.db
```
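
The speedup figures in the output above are consistent with dividing the total database size by the bytes that actually went over the wire (sent plus received). That formula is inferred from the numbers, not taken from the `sqlite3_rsync` documentation, and the `speedup` helper is made up for this check:

```bash
# speedup TOTAL_BYTES SENT_BYTES RECEIVED_BYTES
# Prints total / (sent + received), rounded to two decimals.
speedup() {
    awk -v total="$1" -v sent="$2" -v received="$3" \
        'BEGIN { printf "%.2f\n", total / (sent + received) }'
}

speedup 34131968 4114 174994     # first sync (up)
speedup 34131968 174994 53326    # second sync (down)
```

This prints `190.57` and `149.49`, matching the two runs: each one moves roughly 0.2 MB instead of the full 34 MB.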

Now I can hopefully read new blog posts offline with minimal data transfer overhead.

[newsbeuter]: https://en.wikipedia.org/wiki/Newsbeuter
[newsboat]: https://newsboat.org/
[rsync]: https://rsync.samba.org/
[sqlite-rsync]: https://sqlite.org/rsync.html

---

_Footnotes:_

[^1]: A self-hosted [Stringer](https://github.com/stringer-rss/stringer) instance.
[^2]: `rsync` would totally work just fine for this use case, but then I wouldn't get to try new tools.
[^3]: via [simonw on lobsters](https://lobste.rs/s/2ngsl1/database_remote_copy_tool_for_sqlite#c_3hlhlz)