Semantic MediaWiki

From iGeek
Semantic MediaWiki (SMW) helps organize and retrieve data in a Wikipedia-like website.
Semantic MediaWiki is a tool for giving unstructured data (articles) more structure -- so that you can do things like find similar articles, inter-relate them (or their content), or build tables from data fragments contained in many articles.
ℹ️ Info          
~ Aristotle Sabouni
Created: 2021-07-26 

Order from Chaos

The English language is not very structured. There are some rules, but many exceptions, and you need a deep understanding of context and intent to glean what is meant. It's basically just a pile of words (that can be said in many orders) in a stream of consciousness, in order to get a point across. And like this paragraph, it can be somewhat hard to parse... especially for computers.

What Semantic MediaWiki (or other semantic tools) can do is let you tag key fragments of data -- to give the computer/software little hints about context. That allows you to create new articles, or glean insights from all the data fragments you've already added -- without having to do it manually.

That's a bit abstract, so let's get a little more real-world: I use a bunch of quotes on the site, and I create a page for each. Normally, that's just a bunch of quote pages that you can search for. But with semantic hinting, I noted the author, the citation (where and when it was said), and some tags for the topics it belongs to. That means that if I create an author page, all the quotes (and books, articles, etc.) by that author can be gathered up. The same goes for date, or for a particular topic. The tags and parameters allow many more ways to find or display the same information. Instead of someone having to manually create one index for every possible way the reader might want to find things, the hints allow the computer to automate it.
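
As a rough illustration of the quotes example above, here is a toy model in Python. It is not how SMW is implemented. In SMW itself the hints are written inline as [[Property::value]] and gathered with an #ask query. The page titles, property names ("Has author", "Has topic"), and values below are all hypothetical, invented for the sketch.

```python
import re
from collections import defaultdict

# SMW-style inline annotation: [[Property name::value]]
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)\]\]")

# Hypothetical quote pages; titles, properties, and values are made up.
PAGES = {
    "Quote 1": "First example quote. [[Has author::Author A]] [[Has topic::Travel]]",
    "Quote 2": "Second example quote. [[Has author::Author A]] [[Has topic::Honesty]]",
    "Quote 3": "Third example quote. [[Has author::Author B]] [[Has topic::Travel]]",
}


def build_index(pages):
    """Map (property, value) -> sorted list of page titles carrying that hint."""
    index = defaultdict(set)
    for title, text in pages.items():
        for prop, value in ANNOTATION.findall(text):
            index[(prop.strip(), value.strip())].add(title)
    return {key: sorted(titles) for key, titles in index.items()}


def ask(index, prop, value):
    """Return every page tagged with prop=value (empty list if none)."""
    return index.get((prop, value), [])


INDEX = build_index(PAGES)

# An "author page" and a "topic page" are just two lookups in the same index.
print(ask(INDEX, "Has author", "Author A"))
print(ask(INDEX, "Has topic", "Travel"))
```

The point of the sketch is that the hints are added once per page, and every index (by author, by topic, by date) falls out of the same data.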

So SMW turns the chaos of freeform articles into the order of semi-structured data. Of course, I have to add the structure (categorization) and the data (keys and their values) to every article. But SMW lets me extract those nuggets that I put in there -- both from my site and other sites that use SMW.

Others

🗒️ Note:
I did the first wiki version of this site using the DPL (and LST) plug-ins, which let me put some structured data in and handled a lot of the organization. But display of results was limited, and the structures were inherently fragile. In coding lingo, I'd say it was more like procedural programming than object-oriented: if you named everything right, it could work similarly, but it was hard to debug when something went wrong.

Are there other data organizers? Sure. Using Wikipedia's engine (called MediaWiki) there are the following:

  • SMW - The most full-featured option, with the largest support base, and widely used -- with a lot of German and European enthusiasm keeping it alive.
  • Cargo - A lighter-weight and simpler-to-understand alternative, created by a legend in the community: Yaron Koren. But it doesn't quite have the same following/scale.
  • DPL3 (Dynamic Page List 3) - This is basically a way to grab fragments from other pages. But with judicious, consistent naming, you can do a lot.
  • And a few like WikiDB and Wikibase, which are more about letting you retrieve data from tables that you manage... instead of letting you use articles to structure information.

That's before looking at other website engines or content management systems.

Some of the concepts of structured data aren't that hard. Tagging to find content, for example, is widely used in many apps. Putting it together to dynamically design pages? There, you've narrowed the pool to far fewer contenders -- and the vast majority of them are designed so that you must structure the data the way they want, and display the results with the same limitations. SMW and MediaWiki are more like giving you an engine and letting you build the rest of the car... instead of giving you a car and just letting you pick the paint color. If you like the car they delivered, the latter is great. If you need to design things that no other car has... then an off-the-shelf car, with a few options and colors, just won't be enough.

Views

I have a love/hate relationship going with SMW.

Love

  • The architecture is interesting, and it feels very OOD (Object-Oriented Design): you can add Parameters, Concepts, and so on. It's hard to explain, but it feels like it had thought behind its design.
  • There are a ton of additions/extensions/plug-ins and features, many ways to output data you've put into it, and it once had a rich/thriving developer community.
  • The community is pretty great. When you talk to people, they will often answer questions, and they are helpful, nice, and knowledgeable. The people who like it are all good people.

Hate

  • It's buggy. A lot of the bugs are subtle and things you can work around. But there are more bugs than developers, so it's often easier to work around them than to wait for a fix. Some are pretty foundational or have been there a long time.
    • Basic output formats don't work with images correctly/reliably.
    • Spaces in external URLs make results unreliable (for those URLs) -- solution: just put things through your own template that encodes the link as HTML: <a href="URL">Link Name</a>
    • If you create a category and redirect it to a page, the category isn't in the SMW datastore, so you can't find it.
    • Once you need to rebuild the database (and it gets quirky enough that you might want/need to), there's no easy way to do it, except to visit every page and refresh it (a couple of times). If you were using SMW for something like "Categories="... then I can't find the pages to refresh them, since the root queries were based on category.
    • Adding a parameter doesn't seem to work right away. You need to run some jobs to get it to take effect, and even then it resolves extremely slowly for most people.
  • It's not that performant. If you start inter-relating lots of many-to-many relationships (exactly what it's sort of there for), it is painful to add the parameters, and then doing the page refreshes isn't that fast. For corporate/enterprise-level solutions, this is fine. But if you want a high-volume consumer site that can scale easily? That's more of an issue.
  • You need to know/do a lot of DevOps and deployment stuff to keep it working.
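
The external-URL workaround in the list above (percent-encode the spaces, then emit a plain HTML link) can be sketched in Python. This is only an illustration of the encoding step, not the wiki template the site actually uses. The function name `external_link` and the example URL are hypothetical.

```python
from html import escape
from urllib.parse import quote


def external_link(url, label):
    """Build an HTML anchor, percent-encoding spaces and other unsafe characters."""
    # Leave URL-structural characters (and existing % escapes) untouched;
    # everything else that is unsafe, such as a space, becomes %XX.
    safe_url = quote(url, safe=":/?#[]@!$&'()*+,;=%")
    return f'<a href="{escape(safe_url, quote=True)}">{escape(label)}</a>'


# Hypothetical URL containing a space.
print(external_link("https://example.com/my file.pdf", "My File"))
# -> <a href="https://example.com/my%20file.pdf">My File</a>
```

Keeping "%" in the safe set means a URL that is already encoded is not encoded a second time.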

Alternatives

As I've gotten into SMW more and more, I've learned that Cargo is probably a better solution for my needs (I think... as I haven't ported over to it yet). It doesn't feel as OOD, it doesn't support some constructs that I like (like SubObjects), and it's a lower-level abstraction over the SQL database. But it's more performant, and having less abstraction may make it perform better and fit the problem domain better, instead of being designed for design purity.

I was using SMW to do things like gather all the listings of a certain type using queries (ask) in a template. But the truth is, while that takes simpler steps, it's probably better to have a query page in one place and then include that fragment in other places. That fragment is cached and much faster than redundant queries. So if you rely more on transclusion of cached pages than on embedded queries, the advantages of SMW over Cargo or DPL are much smaller.
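
A toy model of that trade-off, in Python: embedding the same query in many pages runs it once per page, while including one cached fragment runs it once in total. MediaWiki's real caching is more involved than this. The function names and the "Review" listing type are hypothetical, and `run_query` is just a stand-in for an expensive query.

```python
from functools import lru_cache

QUERY_RUNS = 0  # counts how often the expensive query actually executes


def run_query(listing_type):
    """Stand-in for an expensive query; returns hypothetical page titles."""
    global QUERY_RUNS
    QUERY_RUNS += 1
    return (f"{listing_type} A", f"{listing_type} B")


@lru_cache(maxsize=None)
def cached_fragment(listing_type):
    """Stand-in for a single query page whose result is cached and reused."""
    return run_query(listing_type)


def render_embedded(page_count, listing_type):
    """Each page carries its own copy of the query, so it runs every time."""
    return [run_query(listing_type) for _ in range(page_count)]


def render_transcluded(page_count, listing_type):
    """Each page includes the shared fragment, so the query runs only once."""
    return [cached_fragment(listing_type) for _ in range(page_count)]


render_embedded(5, "Review")
embedded_runs = QUERY_RUNS

QUERY_RUNS = 0
render_transcluded(5, "Review")
transcluded_runs = QUERY_RUNS

print(embedded_runs, transcluded_runs)  # 5 1
```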




