The largest electronic catalog of scientific life has been completed

The General Index makes 355 billion words, phrases and sentence fragments from 107 million scientific articles searchable. Publishers may still challenge it.

Free-data advocate Carl Malamud has pulled off another big feat: the world's largest free online catalog of science. The General Index was created by Malamud's non-profit organization, whose best-known undertaking is publishing US legal resources. The index itself is hosted by the Internet Archive.

107 million articles, 355 billion items

The dataset, which includes more than 355 billion words and sentence fragments extracted from the indexed articles, as well as the data tables needed to identify those articles, has been available since October 7. Malamud's project had the backing of a number of eminent scholars, headed by Fenton J. Serville.

According to a report in Nature, the initiative is of great importance in the scientific world because researchers can form a picture of a given scientific publication even if they have no access to its source (for example, because they lack a subscription to a particular journal or archive).

The practical importance of Malamud's initiative was highlighted by Gitanjali Yadav, a computational biologist at the University of Cambridge, who said it is a tremendous help in locating what has been published on the subject she works on. (Yadav studies volatile organic compounds (VOCs) emitted by plants and, as she said, much of the information she needs is scattered across various publications; with Malamud's index she can now gather it.)

Partial Do-It-Yourself Project

The question may rightly be asked: how does the General Index differ from Google Scholar, which indexes paywalled literature with the consent of publishers? Malamud's answer is that Google Scholar lets users run only certain types of text queries and also limits automated searches. For this reason, it is not suitable for computational analyses that require more advanced searches.

The General Index itself arose out of a project that would have allowed the text of scholarly publications to be mined without scholars having access to the text itself. The service, which launched earlier this month, is even simpler: it does not have its own web search page, for example. Anyone who wants to use it has to write their own analysis software for the downloaded data. At the same time, Malamud hopes that users of the index will create open-source search tools and share them with the scientific community.

This partial do-it-yourself solution is not that simple, considering that the index takes up roughly 5 TB compressed and 38 TB uncompressed. The collection includes tables containing nearly 20 billion keywords from the processed articles, as well as article titles, authors and digital object identifiers (DOIs).
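Since the index comes without a search page, using it means writing a small program over the downloaded tables. The following is a minimal sketch of what that might look like. It assumes a hypothetical tab-separated layout with one keyword and one DOI per row; the actual table schema is not described here and may well differ, and the function names and placeholder DOIs are invented for illustration.

```python
import csv
import io

# Hypothetical sample rows: keyword <TAB> DOI. The real General Index
# tables may use a different layout; the DOIs below are placeholders.
SAMPLE = (
    "volatile organic compounds\t10.0000/example.1\n"
    "plant emissions\t10.0000/example.1\n"
    "volatile organic compounds\t10.0000/example.2\n"
    "copyright law\t10.0000/example.3\n"
)


def read_rows(handle):
    """Yield (keyword, doi) pairs from a tab-separated file-like object."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) == 2:  # skip malformed lines
            yield row[0], row[1]


def find_dois(rows, term):
    """Stream rows and return the sorted DOIs whose keyword contains term."""
    term = term.lower()
    hits = set()
    for keyword, doi in rows:
        if term in keyword.lower():
            hits.add(doi)
    return sorted(hits)


result = find_dois(read_rows(io.StringIO(SAMPLE)), "volatile organic")
print(result)
```

Reading the table row by row rather than loading it whole is a deliberate choice here: with tens of terabytes of uncompressed data, anything that tries to hold a full table in memory is unlikely to work on an ordinary machine.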

Important question: is it legal?

According to Malamud, the General Index does not infringe copyright, as it contains only sentence fragments of up to five words from the articles. At the same time, there is no guarantee that publishers will accept this model, which means they could challenge the practice, a legal expert told Nature.
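The five-word limit can be pictured with a short example. This is a minimal sketch of n-gram extraction in general, not Malamud's actual pipeline; the function name, the whitespace tokenization and the sample sentence are all assumptions made for illustration.

```python
def ngrams_up_to(text, max_words=5):
    """Return every run of 1..max_words consecutive words in text."""
    words = text.split()  # naive whitespace tokenization (an assumption)
    out = []
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            out.append(" ".join(words[i:i + n]))
    return out


sentence = "plants emit volatile organic compounds into the air"
fragments = ngrams_up_to(sentence)
longest = max(len(f.split()) for f in fragments)
print(len(fragments), longest)
```

No single fragment reproduces the whole eight-word sentence, which is the property the copyright argument above appeals to.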

Michael Carroll, a legal researcher at American University in Washington DC, for example, sees no impediment to distributing the General Index globally, although he cautions that copyright rules may vary from country to country. According to Carroll, the question is whether Malamud violated the publishers' terms by copying and processing the articles on which the index is based. (Malamud himself acknowledged that, in order to create the index, he had to obtain copies of the 107 million articles processed. However, he did not reveal how he obtained them.)

Nature asked six publishers for their views, but only its own publisher, Springer Nature, was willing to comment on the General Index. Even it said only that it supports open research initiatives, but that their legality is important.


Copyright © 2024 Campus Lately.