I've got the base interface for mapping categories to streams worked out. Adding a new stream directs you to the Django admin, which isn't ideal from a user perspective, but I kinda like it because modifying system data should be different.
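For illustration, the category-to-stream mapping could be sketched roughly like this. The category and stream names are made up, and letting one post land in several streams is an assumption on my part, not how Tanzawa necessarily stores it.

```python
from typing import Optional

# Hypothetical mapping from Wordpress category slug to Tanzawa stream slug.
# None means "don't map this category to any stream".
CATEGORY_TO_STREAM: dict[str, Optional[str]] = {
    "travel": "notes",
    "recipes": "articles",
    "uncategorized": None,
}


def streams_for_post(categories: list[str],
                     mapping: dict[str, Optional[str]]) -> list[str]:
    """Return the streams a post lands in, keeping order and dropping duplicates."""
    streams: list[str] = []
    for category in categories:
        stream = mapping.get(category)  # unknown categories count as unmapped
        if stream is not None and stream not in streams:
            streams.append(stream)
    return streams
```

Keeping "not mapped" as an explicit `None` makes it easy to tell a deliberate skip apart from a category nobody has looked at yet, if that distinction ever matters.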
Have got the base upload form working. After uploading the file it's automatically creating placeholder records for all post formats (Wordpress' built-in Post Kind), Categories, and Post Kinds.
But thinking more about the actual workflow, I think it will be better not to automatically create those records and to truly split it into separate steps. So after uploading the file you're taken to a list page with a list of all uploaded Wordpress files.
Next to each filename there will be five buttons: "Set Category Mapping", "Set Post Format Mapping", "Set Post Kind Mapping" (if found), "Import Media", and "Import Posts". The "Import Posts" button will be disabled until mapping has been set up and media has been imported.
Uploading the file will automatically redirect you to the "Set Category Mapping" page, but if you leave the process midway through you'll be able to pick up where you left off.
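A rough sketch of how the button state and "pick up where you left off" could be derived from a few flags per uploaded file. The class and field names are hypothetical, not Tanzawa's actual models.

```python
from dataclasses import dataclass


@dataclass
class ImportFile:
    """Hypothetical progress flags for one uploaded Wordpress export file."""
    filename: str
    categories_mapped: bool = False
    post_formats_mapped: bool = False
    has_post_kinds: bool = False   # the Post Kind button only shows if found
    post_kinds_mapped: bool = False
    media_imported: bool = False

    def _steps(self) -> list[tuple[str, bool]]:
        # Post Kinds count as done when the export doesn't contain any.
        kinds_done = self.post_kinds_mapped or not self.has_post_kinds
        return [
            ("Set Category Mapping", self.categories_mapped),
            ("Set Post Format Mapping", self.post_formats_mapped),
            ("Set Post Kind Mapping", kinds_done),
            ("Import Media", self.media_imported),
        ]

    def can_import_posts(self) -> bool:
        """The Import Posts button stays disabled until every earlier step is done."""
        return all(done for _, done in self._steps())

    def next_step(self) -> str:
        """Label of the first unfinished step, used to resume midway through."""
        for label, done in self._steps():
            if not done:
                return label
        return "Import Posts"
```

Because the state lives on the record rather than in the session, leaving and coming back later just means asking for `next_step()` again.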
The Small Web is for people (not startups, enterprises, or governments). It is also made by people and small, independent organisations (not startups, enterprises, or governments). On the Small Web, you (and only you) own and control your own home (or homes). This is exactly what motivates me to work on and build Tanzawa. The world needs a smaller web focused on people. The Small Tech principles are also bang on.
Getting the base models used for importing Wordpress posts into Tanzawa built has made the task feel a bit less daunting. There's a clear path forward.
The next big challenge for Tanzawa, and the last thing required for me to switch to it, is importing my data from my existing Wordpress blog. There are 4 major parts to this challenge:
1. Parsing the Wordpress export XML file
2. Figuring out how to map Wordpress posts to Tanzawa posts
3. Downloading and importing media
4. Rewriting existing posts to use these new asset urls and fix links.
The first step is the easiest. I figured out the basics of it yesterday using Beautiful Soup, but it will require more exploration of the various posts before I can decide how to properly map data.
The other steps are manageable, but wrapped up in a 5th challenge: managing the entire import process itself. Initially I had planned on just making a command line import tool. Run the command and it does its best to import everything. But telling Tanzawa how to map categories to streams would entail complex command parameters, which I wouldn't want to use myself, let alone inflict on others.
Rather, I need a simple web interface and database tables that will let me manage and monitor the process. The basic workflow I'm imagining is something like this:
1. User uploads Wordpress export file -> Tanzawa saves it into a blob in its database along with some basic meta information about it.
2. Tanzawa will create a mapping record for each category/post/post-kind found in the file. In step 2 users will see a list of their Wordpress categories with a dropdown next to each one with the stream it should map to (not mapping is also an option).
3. Tanzawa will also provision a record for each photo and post to import. This will include its planned final permanent url, as well as its existing permanent url, and will be central when rewriting content.
4. The photo records will track not only urls, but also file download status, so we don't download photos twice. There should be a page where users can see a list of all photos to import, the status, and perhaps a button to retry if it's failed.
One tricky bit will be that Tanzawa doesn't support background tasks. Which means I can either introduce them (don't really want to) or I need to find a way to control it entirely from the front-end. I think a small Stimulus controller on the photo list page that loops through each photo and calls an import api should be sufficient.
5. Once the photos have been imported there should be a big button to publish all the changes. This will be the button that will actually execute the entry creations.
6. After importing is complete, all of the old Wordpress urls should automatically redirect to their new Tanzawa permalink.
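The photo records and the url rewriting from the steps above could look something like this. It's a minimal sketch with plain dataclasses instead of database tables, and every name in it is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class DownloadStatus(Enum):
    PENDING = "pending"
    DONE = "done"
    FAILED = "failed"


@dataclass
class MediaRecord:
    """Hypothetical record for one photo to import."""
    old_url: str   # existing Wordpress permanent url
    new_url: str   # planned Tanzawa permanent url
    status: DownloadStatus = DownloadStatus.PENDING


def needs_download(record: MediaRecord) -> bool:
    """Skip photos already downloaded; pending and failed ones get (re)tried."""
    return record.status is not DownloadStatus.DONE


def rewrite_content(html: str, records: list[MediaRecord]) -> str:
    """Swap old urls for new ones, but only for media that actually imported."""
    # Naive string replacement. Longer urls go first so a url that is a
    # prefix of another one doesn't clobber it.
    for record in sorted(records, key=lambda r: len(r.old_url), reverse=True):
        if record.status is DownloadStatus.DONE:
            html = html.replace(record.old_url, record.new_url)
    return html
```

The same old-url/new-url pairs could later feed the redirects in step 6, since they already say where each old address should end up.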
Throughout this process I'll likely find data that I (should) import that I don't have a way to handle in Tanzawa - and as such I may need to create features to handle them along the way.
Thinking about how large a task importing Wordpress properly is feels a bit daunting. But if I just make a little progress each day, piece by piece, I'll complete it before I know it.
This past week was spent rounding out support for checkins.
I was so focused on getting locations functioning and out the door that I forgot to include microformatted data. I've now included it along with the map and added tests to ensure I don't break it in the future.
Building locations helped me figure out the best pattern for adding 1-to-1 related data to an entry. A checkin is a location plus a checkin record, which holds the name of the venue and a url for the venue. The one limitation I built in surrounding checkins is that they must be created via a micropub request and can only be updated via the admin interface.
While it's possible to integrate with Foursquare's places api and allow people to "checkin" using Tanzawa, it's a much better experience to use the Swarm app and backfeed it.
Syndications are different from locations or checkins because rather than being a 1-to-1 relationship, they're a 1-to-Many relationship, i.e. a single entry can have multiple syndication urls.
Supporting multiple syndication urls from micropub is straightforward: I can just iterate over the urls and save. However, the admin interface allows me to add, update, and delete records. In addition to the form itself, it also requires a (hidden) management form to manage the number of records and so forth.
Thankfully I was able to work the pattern out, so going forward, adding any other 1-to-Many data for a post should be much quicker.
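To make the management form bit concrete, here's roughly what the submitted data looks like, assuming the admin form is a standard Django formset. The `TOTAL_FORMS`, `INITIAL_FORMS`, `id` and `DELETE` keys follow Django's formset naming; the `syndication` prefix, the `url` field name, and the helper itself are hypothetical.

```python
def build_formset_post_data(prefix, existing, new_urls, delete_ids=()):
    """Build POST data for a formset of syndication urls.

    existing: list of (id, url) pairs already saved for the entry.
    new_urls: urls being added in this request.
    delete_ids: ids of existing records ticked for deletion.
    """
    data = {
        # The hidden management form: how many forms were submitted in total,
        # and how many of them represent records that already exist.
        f"{prefix}-TOTAL_FORMS": str(len(existing) + len(new_urls)),
        f"{prefix}-INITIAL_FORMS": str(len(existing)),
    }
    for index, (record_id, url) in enumerate(existing):
        data[f"{prefix}-{index}-id"] = str(record_id)
        data[f"{prefix}-{index}-url"] = url
        if record_id in delete_ids:
            data[f"{prefix}-{index}-DELETE"] = "on"
    for offset, url in enumerate(new_urls):
        index = len(existing) + offset  # new forms are numbered after existing ones
        data[f"{prefix}-{index}-url"] = url
    return data
```

A dict like this is also handy in tests: post it to the admin view and check which syndication urls were added, updated, or deleted.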
Parsing the Wordpress XML file with feedparser strips all of the Wordpress-specific data. But it looks like I can use BeautifulSoup (which I'm using elsewhere) to get what I need. The "xml" parser preserves the CDATA, so I can get the encoded data, too. Progress.
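A small sketch of pulling the Wordpress-specific fields out of an export. To keep it runnable without extra packages it uses the standard library's `xml.etree.ElementTree` as a stand-in for BeautifulSoup, so it is not the code Tanzawa uses. The sample document is made up, and the `wp` namespace version (1.2) is an assumption that should be checked against the real export file.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared at the top of a Wordpress export file.
NS = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "wp": "http://wordpress.org/export/1.2/",  # assumed version
}

SAMPLE = """<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item>
      <title>Hello</title>
      <link>https://old.example/2021/01/hello/</link>
      <category domain="category" nicename="travel"><![CDATA[Travel]]></category>
      <content:encoded><![CDATA[<p>Hi & welcome</p>]]></content:encoded>
      <wp:post_type><![CDATA[post]]></wp:post_type>
    </item>
  </channel>
</rss>
"""


def parse_items(xml_text):
    """Return one dict per <item>, including the namespaced Wordpress fields."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "post_type": item.findtext("wp:post_type", default="", namespaces=NS),
            # CDATA sections come back as plain text, markup included.
            "content": item.findtext("content:encoded", default="", namespaces=NS),
            "categories": [
                c.get("nicename")
                for c in item.findall("category")
                if c.get("domain") == "category"
            ],
        })
    return items
```

The point is the same as with the "xml" parser: the `content:encoded` and `wp:` elements survive, which is exactly the data a feed parser throws away.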
Shipped full support for checkins and syndication urls. Yay! 🎉 Next up is support for importing posts from Wordpress...😬
I got the web interface working for adding and removing syndication urls. Nice and simple. Once I've got them marking up properly, I think they're ready to ship!
What’s cool about this: you can watch for mentions of whatever you want, and those come to you in the same app where your other feeds live. Following Twitter searches in NetNewsWire looks super handy. Will have to add the #IndieWeb hashtag once this is released.