The next big challenge for Tanzawa, and the last thing required for me to switch to it, is importing my data from my existing Wordpress blog. There are four major parts to this challenge:
1. Parsing the Wordpress export XML file
2. Figuring out how to map Wordpress posts to Tanzawa posts
3. Downloading and importing media
4. Rewriting existing posts to use these new asset urls and fix links.
The first step is the easiest. I figured out the basics of it yesterday using Beautiful Soup, but I'll need to explore the various posts further before I can decide how to properly map the data.
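As a rough illustration of that first step, here is a minimal sketch that splits an export into posts and attachment urls. It swaps Beautiful Soup for the standard library's `xml.etree.ElementTree` so it runs without extra packages. The sample export, the `parse_export` name, and the returned dictionary shape are assumptions for illustration, not Tanzawa's actual code.

```python
import xml.etree.ElementTree as ET

# Namespaces used by Wordpress export (WXR) files. The version suffix on
# the "wp" namespace can differ between exports.
NS = {
    "wp": "http://wordpress.org/export/1.2/",
    "content": "http://purl.org/rss/1.0/modules/content/",
}

# A tiny hand-written export used only for this example.
SAMPLE_EXPORT = """<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item>
      <title>Hello world</title>
      <link>https://example.com/2020/01/hello-world/</link>
      <category domain="category" nicename="notes"><![CDATA[Notes]]></category>
      <content:encoded><![CDATA[<p>First post</p>]]></content:encoded>
      <wp:post_type>post</wp:post_type>
    </item>
    <item>
      <title>photo.jpg</title>
      <link>https://example.com/photo/</link>
      <wp:post_type>attachment</wp:post_type>
      <wp:attachment_url>https://example.com/wp-content/uploads/photo.jpg</wp:attachment_url>
    </item>
  </channel>
</rss>
"""


def parse_export(xml_text):
    """Return (posts, attachment_urls) found in an export file."""
    root = ET.fromstring(xml_text)
    posts, attachments = [], []
    for item in root.iter("item"):
        post_type = item.findtext("wp:post_type", default="", namespaces=NS)
        if post_type == "attachment":
            attachments.append(
                item.findtext("wp:attachment_url", default="", namespaces=NS)
            )
        elif post_type == "post":
            posts.append(
                {
                    "title": item.findtext("title", default=""),
                    "old_url": item.findtext("link", default=""),
                    "content": item.findtext(
                        "content:encoded", default="", namespaces=NS
                    ),
                    # Only real categories, not tags or post formats.
                    "categories": [
                        c.get("nicename")
                        for c in item.findall("category")
                        if c.get("domain") == "category"
                    ],
                }
            )
    return posts, attachments


posts, attachments = parse_export(SAMPLE_EXPORT)
```

Keeping the category nicenames on each post is what would later feed the category-to-stream mapping.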
The other steps are manageable, but wrapped up in a 5th challenge – managing the entire import process itself. Initially I had planned on just making a command line import tool. Run the command and it does its best to import everything. But telling Tanzawa how to map categories to streams would entail complex command parameters, which I wouldn't want to use myself, let alone inflict on others.
Rather, I need a simple web interface and database tables that will let me manage and monitor the process. The basic workflow I'm imagining is something like this:
1. User uploads Wordpress export file -> Tanzawa saves it into a blob in its database along with some basic meta information about it.
2. Tanzawa will create a mapping record for each category/post/post-kind found in the file. In step 2 users will see a list of their Wordpress categories with a dropdown next to each one with the stream it should map to (not mapping is also an option).
3. Tanzawa will also provision a record for each photo and post to import. This will include its planned final permanent url, as well as its existing permanent url, and will be central when rewriting content.
4. The photo records will track not only urls, but also file download status, so we don't download photos twice. There should be a page where users can see a list of all photos to import, the status, and perhaps a button to retry if it's failed.
One tricky bit will be that Tanzawa doesn't support background tasks. That means I can either introduce them (don't really want to) or find a way to control the import entirely from the front-end. I think a small Stimulus controller on the photo list page that loops through each photo and calls an import api should be sufficient.
5. Once the photos have been imported there should be a big button to publish all the changes. This will be the button that actually executes the entry creations.
6. After importing is complete, all of the old Wordpress urls should automatically redirect to their new Tanzawa permalink.
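The photo records in the workflow above could look something like this minimal sketch. It uses plain dataclasses rather than database tables, and the names `PhotoImport`, `Status`, `import_photo`, and `rewrite_content` are hypothetical. The `download` argument stands in for whatever actually fetches and stores the file.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    DONE = "done"
    FAILED = "failed"


@dataclass
class PhotoImport:
    old_url: str  # existing Wordpress url
    new_url: str  # planned permanent url in Tanzawa
    status: Status = Status.PENDING


def import_photo(record, download):
    """Import one photo; safe to call again for retries."""
    if record.status is Status.DONE:
        return record  # already imported, so never download twice
    try:
        download(record.old_url)
        record.status = Status.DONE
    except OSError:
        # Leave the record in place so a retry button can call this again.
        record.status = Status.FAILED
    return record


def rewrite_content(html, records):
    """Point post content at the new url of every imported photo."""
    for record in records:
        if record.status is Status.DONE:
            html = html.replace(record.old_url, record.new_url)
    return html
```

Because each call handles a single photo and checks the status first, a front-end loop can call the import endpoint once per photo and repeat it for failures without creating duplicates.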
Throughout this process I'll likely find data that I (should) import that I don't have a way to handle in Tanzawa - and as such I may need to create features to handle them along the way.
Thinking about how large a task importing Wordpress properly is, it's a bit daunting. But if I just make a little progress each day, piece by piece, I'll complete it before I know it.
This past week was spent rounding out support for checkins.
I was so focused on getting locations functioning and out the door that I forgot to include microformatted data. I've now included it along with the map and added tests to ensure I don't break it in the future.
Building locations helped me figure out the best pattern for adding 1-to-1 related data to an entry. A checkin is a location plus a checkin record, which holds the name of the venue and a url for the venue. The one limitation I built in surrounding checkins is that they must be created via a micropub request and can only be updated via the admin interface.
While it's possible to integrate with Foursquare's places api and allow people to "checkin" using Tanzawa, it's a much better experience to use the Swarm app and backfeed it.
Syndications are different from locations or checkins because rather than being a 1-to-1 relationship, they're a 1-to-Many relationship, i.e. a single entry can have multiple syndication urls.
Supporting multiple syndication urls from micropub is straightforward: I can just iterate over the urls and save. However, the admin interface allows me to add, update, and delete records. In addition to the form itself, it also requires a (hidden) management form to manage the number of records and so forth.
Thankfully I was able to work the pattern out, so going forward, adding any other 1-to-Many data for a post should be much quicker.
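For the micropub side, the "iterate over the urls and save" step might look like this minimal sketch. It assumes the JSON form of a micropub request, where values live in lists under `properties`. The function name and the de-duplication are assumptions for illustration, and the actual saving is left out.

```python
def extract_syndication_urls(payload):
    """Return the unique syndication urls from a micropub JSON payload."""
    properties = payload.get("properties", {})
    urls = properties.get("syndication", [])
    if isinstance(urls, str):
        # Be forgiving if a client sends a single string instead of a list.
        urls = [urls]
    unique = []
    for url in urls:
        url = url.strip()
        if url and url not in unique:
            unique.append(url)
    return unique


payload = {
    "type": ["h-entry"],
    "properties": {
        "content": ["Hello from Tanzawa"],
        "syndication": [
            "https://twitter.com/example/status/1",
            "https://mastodon.example/@me/2",
            "https://twitter.com/example/status/1",
        ],
    },
}
urls = extract_syndication_urls(payload)
```

Each returned url would then become one row tied to the entry, which is the 1-to-Many part.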
Along with my regular status posts, I'm going to try to make a weekly roundup post for Tanzawa. As this is more of an experiment at this time, I'm putting them in the "Articles" stream, but I may add a weekly stream just for these posts.
I built and launched the ability to associate a location with an entry. Initially I had planned on limiting locations to checkins and statuses, but decided against building in an artificial limitation.
Location support is baked into the Tanzawa micropub endpoint as well as the RSS feeds. Posts that have a location associated with them will display the location after the author's name in the post's byline. RSS feeds will append the location name (or coordinates where there isn't an address) to the end of the post.
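The RSS fallback described above (name first, coordinates otherwise) can be sketched in a few lines. The function names, the five-decimal formatting, and the wrapping paragraph tag are assumptions for illustration only.

```python
import html


def location_label(name, latitude, longitude):
    """Prefer the location name; fall back to coordinates."""
    if name:
        return name
    return f"{latitude:.5f}, {longitude:.5f}"


def append_location(body, name, latitude, longitude):
    """Append the location to the end of a post body for the feed."""
    label = html.escape(location_label(name, latitude, longitude))
    return f"{body}\n<p>{label}</p>"


with_name = append_location("<p>Coffee time</p>", "Odawara, Japan", 35.2646, 139.1522)
without_name = append_location("<p>Coffee time</p>", "", 35.2646, 139.1522)
```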
While adding the map to the public post views, I also did some cleanup. I had originally planned on having a 3 column layout for Tanzawa: left navigation, middle content, right meta. But having it split into 3 columns felt unnatural. I removed the meta-data from the third column, though it still exists.
I also cleaned up the footer so it sticks to the bottom of the page without extending the viewport beyond the natural max. Practically speaking, the old footer meant that you'd always get a scrollbar even if the content length didn't warrant it. Ironic given that the footer text reads "Made with care". This text is also now styled to reduce emphasis.
Posts that belong to multiple streams will have their streams highlighted on the left. There's also a new "Home" link that takes you to the top of the site.
When I think about what I want in a map on a blog, my needs are fairly basic: posts that have a location should show a map with an indicator where the post was made, and if I'm unsure of the coordinates (a guarantee), I should be able to search and find it on a map.
If the location is too new or uncommon, it may not show up. In that scenario, finding the location on the map and selecting it manually isn't a large ask.
While maps are an important part of many posts – that new coffee shop I checked in at, the location of that cool bridge in a photo I shared, or that status I posted looking out the window of the shinkansen – they're not central or even wanted in most posts.
Sharing a thought, a checkin, or a photo is the point.
One day there might be public facing features where maps play a prominent role and are the point of a post. But until then the big maps will be reserved for when you're authoring a post and can use the extra space to pan and zoom, and on the public side, they'll be smaller and out of the way – they're not the point.
I got a good question from Adam (@TalAdam) about why I have Tanzawa on a subdomain, rather than my main domain. While the answer is simple, it's a good opportunity to discuss my plans for Tanzawa in the mid-term future.
To answer Adam's question:
- 1. When I started Tanzawa I didn't have a domain (or even a name), so I started with a domain I owned.
- 2. I need to build up minimal feature parity with my current blog before I can switch my main domain from Wordpress to Tanzawa.
What's remaining for my minimal feature parity? Only three parts: bookmarks/likes, checkins (w/ maps), and photo posts. Technically four - but photo posts are lower on the list. Once that's done I'll need to figure out how to migrate my data and create a huge redirect map for nginx. Perhaps by the end of the month?
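A minimal sketch of how that redirect map could be generated from pairs of old and new urls. It emits an nginx `map` block keyed on `$request_uri`. The function name, the `$new_uri` variable name, and the example urls are assumptions for illustration.

```python
from urllib.parse import urlsplit


def nginx_redirect_map(pairs, variable="$new_uri"):
    """Build an nginx map block from (old_url, new_url) pairs."""
    lines = [f"map $request_uri {variable} {{", '    default "";']
    for old_url, new_url in pairs:
        # nginx matches on the request path, so drop scheme and host.
        old_path = urlsplit(old_url).path
        lines.append(f"    {old_path} {new_url};")
    lines.append("}")
    return "\n".join(lines)


conf = nginx_redirect_map(
    [
        ("https://example.com/2020/01/hello-world/", "/2020/01/01/hello-world/"),
        ("https://example.com/about/", "/about/"),
    ]
)
print(conf)
```

The server block would then need something like `if ($new_uri) { return 301 $new_uri; }` to act on the mapped value. Old urls containing spaces or other special characters would need quoting, which this sketch skips.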
Once I migrate my main site to Tanzawa what's left for this blog? I plan to redirect it somewhere on the tanzawa.blog domain. From there, I'll continue using this blog as a development blog as I polish Tanzawa for a proper release that other people can use.
I realized while I've been blogging about the development of Tanzawa, I haven't talked much about my overall goals for the system. This post will dive a bit into what my goals are for the system and how I envision it working.
My main goal for Tanzawa is to make a system that makes it enjoyable and easy to blog (be it micro, photo, articles, whatever) while maintaining ownership of your content. Yes, you can achieve this with Wordpress, to a degree, but it's not made for it.
David Shanske and the other devs who've made the IndieWeb plugins for Wordpress have done fantastic work. It got me back into blogging and fighting for the open web. Tanzawa is my attempt to build upon the ideas in their work and push it forward.
Sustainability & Privacy
My other main goal for Tanzawa is to bring awareness and advocate for a sustainable web. Modern computer systems waste so many resources with poor design, unoptimized media, legacy formats - you name it. Tanzawa will always strive to have the lightest impact on your server, your network, and when rendered on your computer/phone.
To those ends Tanzawa uses minimal css styled with Tailwind - no large component frameworks. Image tags are written so browsers always choose the latest file format with the best compression, so less data is transferred. What's more, we don't convert an image to a new file format until that format is first requested, saving your server from making and your disk from storing images that will never be served. We proudly use system fonts, avoiding megabytes of downloads.
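One common way to let browsers pick the newest format they understand is a `picture` element with one `source` per format, listed newest first. The sketch below generates such a tag. The `?f=` query parameter and the `picture_tag` name are made up for illustration and are not Tanzawa's real url scheme.

```python
import html


def picture_tag(src, alt, formats=("avif", "webp")):
    """Render a picture tag offering newer formats before the original."""
    safe_src = html.escape(src)
    lines = ["<picture>"]
    for fmt in formats:
        # Browsers use the first source whose type they support.
        lines.append(f'  <source srcset="{safe_src}?f={fmt}" type="image/{fmt}">')
    # The plain img is the fallback for browsers that support none of them.
    lines.append(f'  <img src="{safe_src}" alt="{html.escape(alt)}">')
    lines.append("</picture>")
    return "\n".join(lines)


tag = picture_tag("/media/bridge.jpg", "A bridge at dusk")
print(tag)
```

Since each format has its own url, the server only needs to produce the avif or webp version the first time a browser actually asks for it.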
A smaller, but still important goal for Tanzawa is to build a system that respects your privacy and the privacy of your visitors. We don't include any third-party libraries or scripts that track anything about you. We also strip all location and other exif data from uploaded photos.
Own Your Data
One of the promises of POSSE and backfeeding is that you are in control. The current tooling works, but it feels too geeky. I either need to have a custom setup with a static site (which means writing Markdown and using a command line) or use Wordpress, and then I'm stuck within the confines of Wordpress.
I want to make it simple to "tweet" from your blog. To backfeed from the silos into your blog. And I want to be able to remix this data together, to make new posts and pages, rather than having it locked in posts or behind an opaque API.
The recent resurgence of interest in blogging and RSS makes me hopeful that when Tanzawa is ready for general usage there's a chance it'll make a difference.
With support for articles shipped in Tanzawa (this is the first one! 🎉 ) I'm taking a day off coding and doing a day of thinking about the next post kind(s) I want in Tanzawa: bookmarks & replies.
Why group them? Well, they're quite similar in my mind. A "reply" is a blog post that is a reply to some other page or post on the internet. On my current Wordpress site they look like a note with a link to the site and maybe an excerpt of what we're replying to.
Bookmarks look quite similar. In fact they're exactly the same, except I opted to put the link emoji for bookmarks.
Just looking at the two different post kinds, the only difference is in the webmention that's sent. In replies there's a `u-in-reply-to` while in a bookmark there isn't. Part of the reason why they're the same is because of how I treat bookmarks.
When I bookmark something publicly on my blog I write a note as to why I'm bookmarking it. It's less of a reply and more of an aside. However, reflecting on this behavior while writing this post, I think I've been using bookmarks in this manner because I've been using Wordpress. The interface for posting bookmarks and replies (and statuses and articles) is the same and so you're encouraged to treat them the same.
Differentiating Replies and Bookmarks in Tanzawa
Both bookmarks and replies will need a field for inputting the url that we're bookmarking or replying to. Both kinds will need to dynamically load the page and extract title / author / summary information. And in both kinds users should be able to correct the extracted information.
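The extraction step could start out as simple as this sketch, which reads the `title` tag and a few common `meta` tags using only the standard library's `html.parser`. Fetching the page is left out. The class and function names are hypothetical, and the fallback order between Open Graph tags and plain tags is an assumption.

```python
from html.parser import HTMLParser


class PageMeta(HTMLParser):
    """Collect the title tag and named meta tags from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Plain tags use name=..., Open Graph tags use property=...
            key = attrs.get("name") or attrs.get("property")
            if key and attrs.get("content"):
                self.meta.setdefault(key, attrs["content"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_link_preview(page_html):
    """Return editable defaults for the title / author / summary fields."""
    parser = PageMeta()
    parser.feed(page_html)
    meta = parser.meta
    return {
        "title": meta.get("og:title") or parser.title.strip(),
        "author": meta.get("author", ""),
        "summary": meta.get("og:description") or meta.get("description", ""),
    }


page = """<html><head>
<title> An example page </title>
<meta name="author" content="Jane Doe">
<meta name="description" content="A short summary.">
</head><body><p>Hi</p></body></html>"""
preview = extract_link_preview(page)
```

The result is only a starting point for the form, since pages often have missing or misleading metadata that the user will want to correct.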
I think the major difference must be made in how they're displayed in a list. Bookmarks should be displayed completely differently than other post kinds. They should only display the page title / link / date bookmarked / post permalink. Each bookmark's detail page can show the extra meta information: a note, an excerpt, and so forth.
Replies will display much like a status, i.e. there's no post title. But a reply will begin with a link and excerpt of the page being replied to, to set up the context for the post, followed by our note. RSS feeds for bookmarks and replies should be the same: linked site meta info followed by a note.
My bet is that if bookmarks and replies display so differently in your streams on the site, then even though the publishing interface may be similar, your usage will change.