How to get book metadata?

Problem description

My application needs to retrieve information about any published book based on a provided ISBN, title, or author. This is hardly a unique requirement: sites like Amazon.com and Chegg.com, and even software like Book Collector, seem to be able to do this easily. But I have not been able to replicate it.

To clarify, I do not need to search the entire database of books, only a limited subset that has been entered, as in a book collection. The database would simply allow me to tag the entered books with the necessary metadata to enable search on that subset of books. So scale is not the issue here; getting the metadata is.

The options I have tried are:

  1. Scrape Amazon. Scraping the regular Amazon pages was not very robust to things like missing authors, and while scraping the smaller mobile pages was faster, they shared the same issues with robustness of extraction. Plus, building this into an application is a clear violation of Amazon's Terms of Service.
  2. Scrape the Library of Congress. While this seems to have fewer legal ramifications, ease and robustness were again issues.
  3. ISBNdb.com API. While the service is free up to a point, and does a good job of returning the necessary metadata, I need to do this for over 500 books on a daily basis, at which point this service costs money proportional to use. I'd prefer a free or one-time payment solution that allows me to do the same.
  4. Google Book Data API. While this seems to provide the information I need, I cannot display the book preview as their terms of service require.
  5. Buy a license to a database of books. For example, companies like Ingram or Baker & Taylor provide these catalogs to retailers and libraries. This solution is obviously expensive, so I'm hoping that there's a more elegant solution I've missed. But if not, and someone on SO has had a good experience with a particular database, I'm willing to go with that.

I've tried to describe my approach in detail so others with fewer books can take advantage of the above solutions. But given my requirements, I'm at my wits' end for retrieving book metadata, so any pointers are greatly appreciated.

Recommended answer

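Since it is unlikely that you have to retrieve the same 500 books every day, store the data retrieved from isbndb.com in a database and fill it up book by book. That way each book is fetched from the paid service only once, and repeat lookups are served from your own copy.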
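
A minimal sketch of that caching idea in Python, using the standard-library `sqlite3` module. The table layout and the names `open_cache`, `get_metadata`, and `fetch_remote` are all hypothetical, made up for illustration. In particular, `fetch_remote` stands in for whatever call you make to isbndb.com (or any other source); the sketch does not assume anything about that API beyond "takes an ISBN, returns a dict or None".

```python
import json
import sqlite3
from typing import Callable, Optional


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local metadata cache. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS books ("
        "isbn TEXT PRIMARY KEY, metadata TEXT NOT NULL)"
    )
    return conn


def get_metadata(
    conn: sqlite3.Connection,
    isbn: str,
    fetch_remote: Callable[[str], Optional[dict]],
) -> Optional[dict]:
    """Return metadata for an ISBN, calling the remote service only on a cache miss."""
    # Normalize so "978-0-..." and "9780..." map to the same cache key.
    key = isbn.replace("-", "").replace(" ", "").upper()

    row = conn.execute(
        "SELECT metadata FROM books WHERE isbn = ?", (key,)
    ).fetchone()
    if row is not None:
        return json.loads(row[0])  # cache hit: no remote call

    metadata = fetch_remote(key)  # cache miss: one (possibly paid) lookup
    if metadata is not None:
        conn.execute(
            "INSERT INTO books (isbn, metadata) VALUES (?, ?)",
            (key, json.dumps(metadata)),
        )
        conn.commit()
    return metadata


# Usage with a stand-in fetcher; replace it with a real API call.
remote_calls = []


def fake_fetch(isbn: str) -> Optional[dict]:
    remote_calls.append(isbn)
    # Placeholder values, not real book data.
    return {"isbn": isbn, "title": "Example Title", "authors": ["Example Author"]}


conn = open_cache()
first = get_metadata(conn, "978-0-00-000000-2", fake_fetch)
second = get_metadata(conn, "9780000000002", fake_fetch)
```

Here the second lookup is answered from the local table, so `fake_fetch` runs only once. Passing a file path to `open_cache` instead of the default `:memory:` keeps the cache across runs, which is what makes the daily volume against the remote service shrink over time.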