Received comments on "20010314 Triples: The centre of the Semantic Web?"

This is an extract from a dialog between Sean B. Palmer and myself:

> "The exchange of triples between machines will be massive,
> therefore it is very important to invent an efficient conversion
> language"
>
> Hopefully this will not be too much of a problem, because at the
> bottom of it all is the humble triples format URI1(URI2, URI3); it
> should be easy for any language processor to convert to that, and then
> from that into any other RDF language. In that way, it should make
> going from RDF1 => RDF2 as simple as a click of the mouse. In fact, it
> is already very easy to convert between the two biggest RDF languages,
> XML RDF and Notation3, but I am worried that this will put too much
> emphasis on tool-makers rather than users.

Yepp, I agree.

> "I am quite convinced that the format will be based on XML"
>
> Really? I am not at all convinced... well I was up until I discovered
> Notation3 [3], but that syntax really opened my eyes and helped me to
> see just how good the SW could become. Until Notation3, you really
> *had* to write in XML RDF, and it is a very clumsy language. Of
> course, it does have the advantage that it is in XML, but then I still
> prefer to write in Notation3 and then convert it to XML RDF on the
> fly.

This got me thinking, and I hopefully see things a bit more clearly now. Let's make clear why this "format" is needed. Internally, applications could use whatever format they like (as you mention in your More SW Ramblings). For an application to use triples coded in some other format, they need to be converted, and it is not feasible to build an application that contains conversion rules for every other format. My belief is that it is easier to "teach" all applications to make these conversions if they all handle a common base format and "understand" a simple conversion language. Then, to convert a set of triples in a previously unknown format, an application only needs to find (via a URL) the format declaration, which contains the instructions for transforming that format into the base triple format. These conversion instructions are naturally written in the standard conversion language. The ability to do this conversion is, in my opinion, really important. If it works, each application could choose the triple representation that is optimal for it without closing the door to other applications (thanks to the format declaration and the conversion language).
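To make the idea a bit more concrete, here is a minimal sketch in Python of converting through a base triple format. Everything in it is an assumption for illustration only: the in-memory registries stand in for format declarations that would really be fetched via a URL, the "conversion language" is simply Python functions, the URI1(URI2, URI3) notation is read as predicate(subject, object), and the angle-bracket line syntax is a made-up second format. The example.org URIs are placeholders.

```python
import re
from typing import Callable, Dict, List, Tuple

# Base triple format (assumed): a plain (subject, predicate, object) tuple of URIs.
Triple = Tuple[str, str, str]

# Hypothetical registries standing in for format declarations found via a URL.
PARSERS: Dict[str, Callable[[str], List[Triple]]] = {}
SERIALIZERS: Dict[str, Callable[[List[Triple]], str]] = {}

# Toy pattern for "URI1(URI2, URI3)"; it does not cope with URIs that
# themselves contain parentheses, commas or whitespace.
_FUNCTIONAL = re.compile(r"^\s*(\S+?)\(\s*(\S+?)\s*,\s*(\S+?)\s*\)\s*$")


def parse_functional(text: str) -> List[Triple]:
    """Parse lines of the form predicate(subject, object) into base triples."""
    triples = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        match = _FUNCTIONAL.match(line)
        if match is None:
            raise ValueError(f"not a functional triple: {line!r}")
        predicate, subject, obj = match.groups()
        triples.append((subject, predicate, obj))
    return triples


def serialize_lines(triples: List[Triple]) -> str:
    """Write base triples as '<subject> <predicate> <object> .' lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


PARSERS["functional"] = parse_functional
SERIALIZERS["lines"] = serialize_lines


def convert(text: str, source: str, target: str) -> str:
    """Convert between two formats by going through the base triple format."""
    return SERIALIZERS[target](PARSERS[source](text))


example = "http://example.org/author(http://example.org/doc, http://example.org/sean)"
print(convert(example, "functional", "lines"))
# prints: <http://example.org/doc> <http://example.org/author> <http://example.org/sean> .
```

The point of the sketch is only that each format needs one parser into the base format and one serializer out of it, rather than a converter for every pair of formats.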

Now, what properties should this base triple format have? To answer this question, one first has to answer the question of where this format will "live". Perhaps the format will only be an abstract format used specifically for conversion, i.e. not present in any files. This would mean that the base triple format exists only as an abstract construct inside applications doing conversions. On the other hand, if the format is constructed as, e.g., XML, it could be used for things other than conversions. Basic applications might use this base triple format instead of constructing a new format or using an existing one. As I am writing this, I find the first option more appealing. And if that is how the format is used, there is no point in choosing XML. N3 might be an option in this case. Hmm... I'm a bit confused right now (not seeing things as clearly as I thought ;), so perhaps I should stop right here and think some more. Please give me your thoughts on this!

> "at least two of the elements in the triple have to be URIs"
>
> Nope, all three of the elements in a triple have to be a URI: for a
> string literal (the thing in the quote marks "") is actually just
> shorthand for a data:, URI [4]. data:, URIs are URIs that convey data
> (obviously), so they are very useful.. but people just use the string
> literal notation instead - it's easier because you don't have to URI
> escape it (converting all of the odd characters to Unicode etc.,
> ugh!).

Yepp, I agree. You've taught me something new (to me).
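As a small illustration of the escaping Sean mentions, here is a sketch in Python that turns a string literal into a data: URI and back. The function names are my own, and the sketch assumes the simplest form of the scheme ("data:," followed by the percent-encoded text, with no media type given) and that non-ASCII characters are encoded as UTF-8, which is what Python's urllib.parse.quote does by default.

```python
from urllib.parse import quote, unquote

PREFIX = "data:,"  # simplest data: URI form, no explicit media type


def literal_to_data_uri(text: str) -> str:
    """Percent-encode a string literal and wrap it in a data: URI."""
    # safe="" forces every reserved character (including "/") to be escaped.
    return PREFIX + quote(text, safe="")


def data_uri_to_literal(uri: str) -> str:
    """Recover the string literal from a data: URI produced above."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not a simple data: URI: {uri!r}")
    return unquote(uri[len(PREFIX):])


uri = literal_to_data_uri("Hello, world!")
print(uri)                       # prints: data:,Hello%2C%20world%21
print(data_uri_to_literal(uri))  # prints: Hello, world!
```

Seeing the escaped form next to the plain string makes it obvious why people prefer the quoted literal notation.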

> "But is there going to be a new URI scheme for exchanging
> triples, hmm I don't know, but generally one should not invent
> URI schemes if it is not absolutely necessary."
>
> Oh no! I just made up a new URI scheme the other day for triples:
> "sem:". Although I think it is a very bad idea in the wider sense (as
> you say), it could be a good idea for person to person processing. I
> thought that one would be needed to show that "X is a triples
> format"... for example, you would have sem:URI, and then get your SW
> processor (whatever that is) to process that URI. I actually did this
> in Netscape 6: I downloaded Protozilla (which is a program that allows
> you to make your own protocols and get them to work), and got the SEM
> URIs to redirect and process. I successfully managed to get a sem: URI
> to convert from N3 to XML RDF. That isn't the only thing you could do
> to it, but it was a good example. On my homepage [5] there is a link
> to a sem: URI [6].

A sem: URI space seems quite reasonable. I've had it in mind a few times... This space will not be used directly by humans, but applications might use it to create higher-level information for human use. This is similar to SQL and databases: accessing the sem: space is a bit like using SQL (well... not in its role as a query language, but as a means of access to the data). To be useful, higher-level information needs to be created by applications that abstract away the actual access and the data.