Developers don't use the Semantic Web because they shouldn't

Discussion:

ajs6f

2018-11-22 16:17:16 UTC

I've expressed this opinion before in other venues, and it's gone over like a lead balloon, so why not again? :grin:

The "middle third" of developers don't generally use SemWeb technologies for the same reason that the "upper third" and "lower third" don't; they have no reason whatsoever to do so.

SemWeb technologies show their strength when crossing boundaries (between disciplines, between organizations, even between technical stacks or individual data sources). Most developers don't do that for a living. They work within relatively tightly-focussed areas, like building a single app for mobile phones that works off a single API, or a website that caters to one organization's users, or a management system for one business unit. RDF tooling delivers no value to such teams and costs a fortune compared with simpler approaches. Why would they use it? They shouldn't!

On this view, technical changes like bnodes for predicates or better support for list constructs aren't to the purpose. (Whether or not they are good ideas on other grounds is a different question, of course.) But to my eye this view does disclose (at least) two potential avenues towards real change:

• I know of little OLAP work that is currently done with open semantic technologies, although OLAP frequently brings together multiple sources of data and the kinds of queries that people use for that work could benefit enormously from semantic lifting. It seems to me that that could change, if the perception of poor performance and intractable constructions changed. (I'm not making any argument about the _actual_ performance of semantic web tooling, which is of course a complex question that I have rarely heard discussed usefully without specific examples. The perception, however, is pretty clearly pretty awful.) This could mean work to clarify and publicize the real potential for performance, and to improve it.

• I believe that semantic technologies might really benefit so-called "data lake" approaches in which data is quickly ingested and indexed without normalization and then transformations are applied more-or-less dynamically to query or process different sections of data together. Again, the common factor is the need to bring together disparate data sources and the immediate obstacle (or at least, _an_ immediate obstacle) is perceived performance.
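To make "semantic lifting" across disparate sources concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the two record sets, their field names, the `ex:` identifiers and the `lift` helper are hypothetical and do not come from any particular tool.

```python
# Two hypothetical sources describing the same object under different schemas.
museum_rows = [{"obj_id": "A1", "maker": "Hokusai"}]
archive_rows = [{"item": "A1", "acquired": "1905"}]


def lift(rows, id_field, field_to_predicate):
    """Turn flat records into (subject, predicate, object) triples."""
    triples = set()
    for row in rows:
        subject = "ex:object/" + row[id_field]
        for field, predicate in field_to_predicate.items():
            triples.add((subject, predicate, row[field]))
    return triples


# Once both sources share identifiers and predicates, merging is a set union.
graph = (
    lift(museum_rows, "obj_id", {"maker": "ex:creator"})
    | lift(archive_rows, "item", {"acquired": "ex:acquisitionYear"})
)

# Collect everything known about one object, regardless of where it came from.
facts = {p: o for (s, p, o) in graph if s == "ex:object/A1"}
print(facts)
```

The per-source mapping is the only schema-specific part here; the merged graph can be queried without knowing which source contributed which fact.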

To be clear, I'm in no way opposed to technical improvements! (If nothing else, as a committer for Apache Jena, I'm excited to make our own work easier and to make it easier to involve and excite others.) And as someone who (substantially) makes his living applying linked data ideas for cultural heritage and scientific research, I want these ideas to spread widely!

I see some pretty hopeful developments, like technologies that make it easier to use semantic tech in "big data" settings, be they open [1] or offered as a service [2], or the beginnings of work on using the power of statistical methods for semantic lifting [3].

All in all, my claim is that working to get a great bulk of developers using semantic tech may not be the right problem to work on. Working to get the much smaller number of developers with really on-point needs using (or able to use) semantic tech is a better task, and one for which this community is truly fitted.

---
Adam Soroka
Research Computing : Office of the CIO : the Smithsonian Institution

[1] http://sansa-stack.net/
[2] https://aws.amazon.com/neptune/
[3] http://www.semantic-web-journal.net/content/machine-learning-internet-things-semantic-enhanced-approach-1

Michael Brunnbauer

2018-11-23 10:13:03 UTC

hi

+1 to everything Adam said.

Triples (EAV) are a well-known antipattern in the world of relational databases. The situations where they actually make sense are rare. It would be a mistake to pitch RDF to the average developer without some big caveats.
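A minimal sketch of why EAV is usually seen as an antipattern, using Python's built-in sqlite3 module. The `person` and `eav` tables and their contents are hypothetical, made up only for this comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conventional design: one typed column per attribute.
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL);
    INSERT INTO person VALUES (1, 'Ada', 'London'), (2, 'Bob', 'Munich');

    -- EAV / triple design: every fact is one (entity, attribute, value) row.
    CREATE TABLE eav (entity INTEGER, attribute TEXT, value TEXT);
    INSERT INTO eav VALUES
        (1, 'name', 'Ada'), (1, 'city', 'London'),
        (2, 'name', 'Bob'), (2, 'city', 'Munich');
""")

# Conventional: a single-table lookup.
wide = conn.execute("SELECT name FROM person WHERE city = 'Munich'").fetchall()

# EAV: the same question needs one self-join per attribute involved.
tall = conn.execute("""
    SELECT n.value
    FROM eav AS c
    JOIN eav AS n ON n.entity = c.entity AND n.attribute = 'name'
    WHERE c.attribute = 'city' AND c.value = 'Munich'
""").fetchall()

print(wide, tall)
```

Both queries return the same row, but the EAV version loses column types and NOT NULL constraints and grows by one join for every additional attribute.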

Computers and the Internet used to be fun. But suddenly people are doing serious stuff with them. Very serious stuff. Meanwhile the people enabling all this continue piling layer after layer on the tower in their game of Jenga. Recent events have shown that even the lowest layer of that tower cannot be trusted.

RDF is deceptively simple. You start with a simple idea and end up with a complex mess. Or as they say about EAV: "It gives you enough rope to hang yourself". I don't think this will be popular in the world of tomorrow - when the tower has fallen.

Or maybe I'm just getting old :-) Bruce Schneier thinks along the same lines - but then he is old too.

Regards,

Michael Brunnbauer

--
++ Michael Brunnbauer
++ netEstate GmbH
++ Geisenhausener Straße 11a
++ 81379 München
++ Tel +49 89 32 19 77 80
++ Fax +49 89 32 19 77 89
++ E-Mail ***@netestate.de
++ https://www.netestate.de/
++
++ Sitz: München, HRB Nr.142452 (Handelsregister B München)
++ USt-IdNr. DE221033342
++ Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++ Prokurist: Dipl. Kfm. (Univ.) Markus Hendel

Adrian Walker

2018-11-23 16:29:14 UTC

Michael,

There is a drastic simplification. One can just use RDF as relational
triples and apply Apt-Blair-Walker [1] or similar semantics, as in the
examples [2]. That makes things easier for SQL programmers (of which
there are many!). It also moves institutional boundary crossings into the
application layer, where they can be more easily explained.
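
As a rough illustration of "RDF as relational triples", the sketch below stores triples in one SQL table and evaluates a recursive, Datalog-style rule with a recursive common table expression. It assumes a hypothetical `triple` table and made-up `ex:` terms, and it does not implement Apt-Blair-Walker semantics; it only shows rule-like querying over triples from plain SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triple (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triple VALUES (?, ?, ?)",
    [
        ("ex:Munich", "ex:partOf", "ex:Bavaria"),
        ("ex:Bavaria", "ex:partOf", "ex:Germany"),
        ("ex:Germany", "ex:partOf", "ex:Europe"),
    ],
)

# Rule, in Datalog style:
#   within(X, Y) :- partOf(X, Y).
#   within(X, Z) :- within(X, Y), partOf(Y, Z).
rows = conn.execute("""
    WITH RECURSIVE within(s, o) AS (
        SELECT s, o FROM triple WHERE p = 'ex:partOf'
        UNION
        SELECT w.s, t.o
        FROM within AS w
        JOIN triple AS t ON t.s = w.o AND t.p = 'ex:partOf'
    )
    SELECT o FROM within WHERE s = 'ex:Munich' ORDER BY o
""").fetchall()

# Everything that transitively contains ex:Munich.
containers = [o for (o,) in rows]
print(containers)
```

The triple table stays fixed while the rules live in the query, which is the part a SQL programmer can read without learning a new data model.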

Cheers, Adrian

[1] K. Apt, H. Blair and A. Walker, "Towards a Theory of Declarative
Knowledge". In: Foundations of Deductive Databases and Logic Programming,
J. Minker (Ed.), Morgan Kaufmann, 1988.

[2] www.executable-english.com/demo_agents/RDFQueryLangComparison1.agent

Adrian Walker
Executable English LLC
San Jose, CA, USA
860 830 2085
https://www.executable-english.com

Michael Brunnbauer

2018-11-23 16:40:24 UTC

Hello Adrian,

even without trying to understand what you wrote I know it's a sales pitch
for your product - because you've done it so often over the past years :-)

You should keep your advertisements off the lists or they might ban you.

Regards,

Michael Brunnbauer
