Recommendations for database with R

Source: https://www.itbaoku.cn/post/597736.html

Question

I am using R to run simulations using time series data. I have been using arrays to store data but I need a less memory intensive solution for storing data at intermediate steps in order to document the process. I am not a programmer so I am looking for something relatively easy to setup on multiple platforms if possible (Windows, Mac, Linux). I also need to be able to directly call the database from R since learning another language is not feasible now. Ideally, I would like to be able to read and write frequently to the database in a manner similar to an array though I don't know if that is realistic. I will gladly sacrifice speed for ease of use but I am willing to work to learn open source solutions. Any suggestions would be appreciated.

Recommended answer

"I also need to be able to directly call the database from R"

I suggest setting up MySQL with the RMySQL interface.

Once the DB connection is open, you can query the database and get the data into R, for example:

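# `con` is assumed to be an open connection, e.g. created with
# dbConnect(RMySQL::MySQL(), ...) using your own connection details
library(DBI)

# Run an SQL statement by first creating a result set object
rs <- dbSendQuery(con, statement = paste(
                      "SELECT w.laser_id, w.wavelength, p.cut_off",
                      "FROM WL w, PURGE p",
                      "WHERE w.laser_id = p.laser_id",
                      "ORDER BY w.laser_id"))
# we now fetch records from the result set into a data.frame
data <- dbFetch(rs, n = -1)   # extract all rows
dbClearResult(rs)             # release the result set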

RMySQL: R interface to the MySQL database

Database interface and MySQL driver for R. This version complies with the database interface definition as implemented in the package DBI 0.2-2.

MySQL Database:

Available for all the platforms you cited in the question, and more, download here.

Other recommended answer

Quick comments:

  • R is good at this: as a language for programming with data, it has plenty of interfaces.
  • There is an entire manual devoted to data import/export, and it has a section on relational databases, so start there.
  • R has the widely-used DBI package which provides a unified interface for many backends, among them SQLite, MySQL, PostgreSQL, Oracle, ... Use that, maybe with RSQLite to get something going quickly. You can still switch backends afterwards.
  • There is also RODBC but I find ODBC tedious to work with.
  • R also has a specialised variant in the TSdbi package by Paul Gilbert, which brings a DBI-like abstraction to time-series databases. It also supports multiple backends.
  • The data.table package was written for this and is very fast on indexing and aggregation.
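As a rough illustration of the DBI plus RSQLite route mentioned above, here is a minimal sketch. The table name `sim_steps`, the column names, and the toy data are all assumptions made up for this example; it also assumes the DBI and RSQLite packages are installed. SQLite needs no server, so the same code should work on Windows, Mac, and Linux.

```r
library(DBI)

# Hypothetical database file; a temporary file keeps the sketch self-contained.
db_path <- tempfile(fileext = ".sqlite")
con <- dbConnect(RSQLite::SQLite(), db_path)

# Hypothetical simulation output: one row per (step, time index).
sim <- data.frame(
  step  = rep(1:2, each = 3),
  t     = rep(1:3, times = 2),
  value = c(0.1, 0.2, 0.3, 1.1, 1.2, 1.3)
)

# Store each intermediate step as it is produced, appending to one table.
for (s in unique(sim$step)) {
  dbWriteTable(con, "sim_steps", sim[sim$step == s, ], append = TRUE)
}

# Read back only the step that is needed instead of keeping everything in memory.
step2 <- dbGetQuery(
  con,
  "SELECT t, value FROM sim_steps WHERE step = 2 ORDER BY t"
)

dbDisconnect(con)
```

Because only the `dbConnect()` call names the backend, switching to MySQL or PostgreSQL later should mostly mean changing that one line.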

Other recommended answer

Do you really need a database solution for your purpose? You say you want a "solution for storing data at intermediate steps" -- how about simply saving the data array to disk at the required time points?

Edit: to make it possible to retrieve the information, you can embed meta-information, e.g. trial index and/or timestamp, in the filename. Then later you can locate and load the file using the correct filename.
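A minimal sketch of that file-based approach, using only base R. The helper `snapshot_path()`, the file-name pattern, and the toy array are hypothetical choices for illustration, not a fixed convention.

```r
# Hypothetical helper: build a file name that encodes trial index and timestamp.
snapshot_path <- function(dir, trial, time = Sys.time()) {
  stamp <- format(time, "%Y%m%dT%H%M%S")
  file.path(dir, sprintf("trial-%03d_%s.rds", trial, stamp))
}

# Hypothetical output directory; a temporary one keeps the sketch self-contained.
out_dir <- tempfile("snapshots")
dir.create(out_dir)

# Hypothetical intermediate result: a small 3-d array.
state <- array(seq_len(24), dim = c(2, 3, 4))

# Save the array at the required time point.
path <- snapshot_path(out_dir, trial = 7)
saveRDS(state, path)

# Later: locate the file for trial 7 by its name and load it again.
found <- list.files(out_dir, pattern = "^trial-007_", full.names = TRUE)
restored <- readRDS(found[1])
```

`saveRDS()` stores one object per file, so each snapshot can be loaded on its own and the full history never has to sit in memory at once.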