Sunday, December 17, 2017

Winter solstice challenge #2: first submission

Citation data for articles colored by availability of a full text, as indicated in Wikidata. Mind the artifacts. Reproduce with this query and the new GraphBuilder functionality.
Bianca Kramer (of 101 Innovations fame) is the first to submit results for the Winter solstice challenge and it's impressive! Her overall score, based on her own publications and their first-level citations, is 54%!

So, the current rankings in the challenge are as follows.

Highest Score

  1. Bianca Kramer

Best Tool

  1. Bianca Kramer

I'm sure she is more than happy if you use her tool to calculate your score. If you're patient, you may even wish to take it one level deeper.

What are you talking about??
Well, the original post sheds some light on this, but basically scientific writing has become so dense that a single paper does not provide enough information. And if you cannot read the cited papers, you may not be able to precisely reproduce what the authors did. Now that many countries are steadily heading towards 100% #OpenAccess, it is time to start thinking about the next step. So, is the knowledge you built on also readily available, or is that still locked away?
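
To make the idea of such a score concrete, here is a minimal sketch in Python. It assumes the score is simply the share of openly available works among your own publications plus the works they cite (the first level); the exact definition is in the original post and Bianca's tool may well calculate it differently. The identifiers and availability flags below are made up for illustration.

```python
def open_score(own_papers, citations, is_open):
    """Share of openly available works among own papers and first-level citations.

    own_papers: list of identifiers of your own publications
    citations:  dict mapping an own paper to the identifiers it cites
    is_open:    dict mapping an identifier to True/False (hypothetical data)
    """
    # Collect every distinct work: own papers plus everything they cite.
    works = set(own_papers)
    for paper in own_papers:
        works.update(citations.get(paper, []))
    if not works:
        return 0.0
    # Works without availability information are counted as not open.
    open_count = sum(1 for work in works if is_open.get(work, False))
    return open_count / len(works)


# Invented example data, not real publications.
own = ["paper-A", "paper-B"]
cites = {"paper-A": ["ref-1", "ref-2"], "paper-B": ["ref-2", "ref-3"]}
availability = {
    "paper-A": True, "paper-B": True,
    "ref-1": False, "ref-2": True, "ref-3": False,
}
print(f"{open_score(own, cites, availability):.0%}")  # 60%
```

Going "one level deeper" would mean also adding the works cited by ref-1, ref-2, and ref-3 to the set before counting.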

For example, take the figure on the right-hand side: it shows when the articles I cited in my work were published (a subset, because it is based on data in Wikidata, using the increasing amount of I4OC data). We immediately see some indication of the availability of the cited papers. The more yellow, the more available. However, keep in mind that this is based on "full text availability" information in Wikidata, which is very sparse. That is what makes Bianca's approach so powerful: it uses (the equally wonderful)

You will also notice the immediate quality issues. Apparently, this data tells me I am citing articles from the future :) You also see that I am citing some really old articles.
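
A quick sanity check for such artifacts could look like the sketch below. It assumes you have exported the query results as (title, publication year) pairs; the records and the cut-off year for "really old" are invented for illustration and are not taken from the actual query.

```python
from datetime import date


def flag_suspicious(cited, today=None, oldest_plausible=1900):
    """Return titles published in the future and titles older than a cut-off year."""
    current_year = (today or date.today()).year
    future = [title for title, year in cited if year > current_year]
    very_old = [title for title, year in cited if year < oldest_plausible]
    return future, very_old


# Hypothetical records, not real citation data.
records = [("Article X", 2016), ("Article Y", 2025), ("Article Z", 1874)]
future, very_old = flag_suspicious(records, today=date(2017, 12, 17))
print(future)    # ['Article Y']
print(very_old)  # ['Article Z']
```

The really old articles may be perfectly valid citations, so the second list is something to inspect by hand rather than to discard.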

Saturday, December 16, 2017

The National Plan Open Science Estafette: my own first Open Science steps

Note: if you prefer to read Dutch, read the original.
Earlier this year Delft hosted a meeting for Dutch scholars aimed at hearing and learning about, and giving feedback on, the National Plan Open Science (doi:10.2777/061652). I'm very happy I have been able to contribute to this effort, because more and better access to knowledge is very dear to me. During lunch time everyone could demonstrate their own Open Science. From this the idea evolved to have a relay race ("estafette"). In each part of the relay someone will tell their Open Science story. This post is the start: every next runner tells their story about the role Open Science has in their research. And it does not matter if the focus is on Open Data, Open Access, or Open Source, because the diversity in the Dutch Open Science community is just very high.

My Open Science story goes back to the time that I was studying chemistry at what is now called the Radboud Universiteit. Chemistry students could get access to the internet in 1994 and this opened a world of Open knowledge to me! Our library was well stocked, but I still had to visit research departments to read certain journals. Always uncomfortable as a young student to walk into a coffee room with senior researchers.

I learned HTML and later Java. Java, with its applets, brought the internet to life. It could visualize 3D models of chemical structures. A paper journal cannot do that. Twenty years later journals still don't have this functionality, but that's not the point. In those three, four years I got introduced to three projects, each Open Source: Jmol (now JSmol), JChemPaint, and the file format "Chemical Markup Language" (CML). The first was to visualize 3D structures on the internet and the second was to visualize 2D chemical diagrams. CML was a format that could store 2D and 3D coordinates for me. But the problem was that neither Jmol nor JChemPaint could read CML.

But that's where Open Science comes in. After all, I could download the Jmol and JChemPaint source code, change it, and share that with others. That was brilliant! And I dived in. Of course, I could have just used my changes myself, but because I realized it could benefit others too, I sent my changes ("patches") to the authors of Jmol and JChemPaint. Extremely happy and proud I was when the two researchers from Germany and the U.S.A. included those patches in their version!

And in the end it was not in vain. In the final year of my chemistry study I submitted an abstract to an international conference. It got accepted! But now I had to go to Washington (Georgetown, to be precise), to talk about my work. On top of that, we agreed to meet the authors of Jmol and JChemPaint in South Bend, where we laid the foundation of a new Open Science project, the Chemistry Development Kit (CDK). Expensive trip, but fortunately I got a bursary from a Dutch company. A peculiar trip it was. We used an Amtrak sleeping train and had dinner with a soldier who served during D-Day. In New York I stepped off the sidewalk onto the street to evade a group of scary heavy boys (which turned out to be a popular boy band), and we stood in the WTC (a year before 9/11) to hear two tourists ask at the musical ticket sale desk what Broadway was.

I am proud that I have been able to contribute to these Open Science projects and that I co-founded the CDK. The Open nature of these projects has had a significant impact and, after twenty years, still does. Sure, it's not the same as discovering a new protein or metabolite, but these projects have definitely benefited more than just my own research. Of course, also with a huge thanks to Hens Borkent, Dick Wife, Dan Gezelter, Christoph Steinbeck, and Peter Murray-Rust.

BTW, thinking about this relay race, Open Science itself is also a relay race: you take the token of the people before you, adopt the token, and pass on the token to the next scientist. And every day the token gets brighter!

This Nationaal Plan Open Science Estafette also continues. I am delighted to pass my token to Rosanne Hertzberger. Read her story here or in Dutch.

Friday, December 15, 2017

Suggestions for ScholarlyHub

Mock Dashboard of ScholarlyHub.
(I'll assume CC-BY for this image.) 
ScholarlyHub is an open scholar profile project. I have no idea yet where this platform is going, but their planned open source nature makes me want to explore it nevertheless. The project is currently running a crowdfunding campaign and developing their plans. They asked for feedback, so here goes:

Feature requests:
  • researchers care about research, more than profiles: make things from their research ("topics") part of their profile; let them tell everyone what they are interested in
  • the website should have an API (good looks are not enough). Have you done a persona analysis? User friendly is only defined if you have defined your users.
  • make the resource FAIR: use or RDFa
  • show innovation into new scholarly activities: provide peer review functionality, etc (similar to Publons, PubPeer, PubMed Commons, etc)
  • support data and software citations
  • use identifiers (DOI, ORCID, project IDs (CORDIS, etc), etc)
  • integration of I4OC
  • freely provide #altmetrics
  • release soon, release often
  • use RSS for any bit of information on the site (one form of API, in fact)
  • integrate my social feeds into my profile (Twitter, blog, LinkedIn, etc)
You can browse my blog for other features I have recommended to websites in the past. You can also check Scholia for ideas.

New paper: "Integration among databases and data sets to support productive nanotechnology: Challenges and recommendations"

Figure 1 from the NanoImpact article. CC-BY.
The U.S.A. and European nanosafety communities have a longstanding history of collaboration. On both sides there are working groups, NanoWG and WG-F (previously called WG4) of the NanoSafety Cluster. I have been chair of WG4 for about three years and am still active in the group, though in the past half year, without dedicated funding, less so. That is already changing again with the imminent start of the NanoCommons project.

One of these collaborations resulted in a series of papers around data curation (see doi:10.1039/C5NR08944A and doi:10.3762/bjnano.6.189). Part of this effort was also a survey about the state of databases. A good number of databases responded to the call. It turned out to be non-trivial to analyse the results and write up a report around it with recommendations. The first version was submitted and rejected. With fresh leadership, the paper underwent a significant restructuring by John Rumble, was resubmitted to Elsevier's NanoImpact, and is now online (doi:10.1016/j.impact.2017.11.002).

The paper gives an overview of challenges and a recommendation to the community on how to proceed. That is, basically, how should projects like eNanoMapper, caNanoLab, and Nanomaterial Registry evolve, and what might the European Union Observatory for Nanomaterials (EUON) look like? BTW, a similar paper by Tropsha et al. was published the other week with a focus on the USA database ecosystem (doi:10.1038/nnano.2017.233).

Have fun reading it, and if you are working in a related field, please join either of the two aforementioned working groups! And a huge thanks to everyone involved, particularly Sandra, John, and Christine.

Saturday, December 09, 2017

every house a library

Two weeks ago someone made me aware of This new social profile page allows you to digitize a list of your books. That in itself is already very useful. In the past I have made some efforts to create overviews of the books I own, if only to get some count, with respect to insurance, etc. But a very neat feature is the ability to find other people in your neighborhood with whom you may exchange books: for each book you can indicate who can see that you own it (no one, by default; if I am not mistaken, inheriting the ugo (user, group, other) approach of UNIX), and whether others can borrow or buy this book.

This shows users with public profiles in The Netherlands, showing the uptake has not been massive yet, but I'm hoping this post changes that :)

But the killer feature I am waiting for is a map like this, but then for books. Worldcat has a feature where it lists the nearest library that has a copy of the book you are looking for:

And such a feature (map or list) would be brilliant, for then every house would potentially be a library. That idea appeals to me.

Oh, and it's Wikidata-based but that should hardly be a surprise.