Getting random row through SQLAlchemy

Article URL: https://www.itbaoku.cn/post/597501.html

Question

How do I select one or more random rows from a table using SQLAlchemy?

Recommended answer

This is very much a database-specific issue.

I know that PostgreSQL, SQLite, MySQL, and Oracle have the ability to order by a random function, so you can use this in SQLAlchemy:

from sqlalchemy import func, select, text

# Table is a placeholder for your own mapped class or table
stmt = select(Table)

stmt.order_by(func.random())  # for PostgreSQL, SQLite

stmt.order_by(func.rand())  # for MySQL

stmt.order_by(text('dbms_random.value'))  # for Oracle

Next, you need to limit the query by the number of records you need (for example using .limit()).
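
Putting the ordering and the limit together, a minimal sketch might look like the following. It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database; the Item model and the random_rows helper are hypothetical names used only for illustration.

```python
from sqlalchemy import Column, Integer, String, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Item(Base):  # hypothetical model, for illustration only
    __tablename__ = "item"
    id = Column(Integer, primary_key=True)
    name = Column(String)


def random_rows(session, n):
    """Return up to n rows in random order (func.random() suits SQLite and PostgreSQL)."""
    stmt = select(Item).order_by(func.random()).limit(n)
    return session.execute(stmt).scalars().all()


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Item(name=f"item-{i}") for i in range(10)])
    session.commit()
    picked = random_rows(session, 3)
    picked_ids = [item.id for item in picked]
    print(picked_ids)
```

On MySQL the same helper would presumably use func.rand() in place of func.random().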

Bear in mind that, at least in PostgreSQL, selecting a random record has severe performance issues; here is a good article about it.

Other recommended answer

If you are using the ORM, the table is not big (or you have its row count cached), and you want the solution to be database independent, the really simple approach is:

import random
rand = random.randrange(0, session.query(Table).count()) 
row = session.query(Table)[rand]

This is cheating slightly, but that's why you use an ORM.
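
The same count-then-index idea can be sketched in the SQLAlchemy 1.4+ select() style. The Thing model and the random_row helper below are hypothetical; the guard for an empty table and the explicit order_by are additions not present in the original snippet (random.randrange raises ValueError when the count is 0, and an ordering keeps row positions stable between the count and the fetch).

```python
import random

from sqlalchemy import Column, Integer, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Thing(Base):  # hypothetical model, for illustration only
    __tablename__ = "thing"
    id = Column(Integer, primary_key=True)
    value = Column(Integer)


def random_row(session):
    """Pick one row by counting the rows, then skipping a random offset."""
    total = session.execute(select(func.count()).select_from(Thing)).scalar_one()
    if total == 0:
        return None  # nothing to choose from
    offset = random.randrange(total)
    stmt = select(Thing).order_by(Thing.id).offset(offset).limit(1)
    return session.execute(stmt).scalars().first()


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    empty_result = random_row(session)  # table is still empty here
    session.add_all([Thing(value=i) for i in range(5)])
    session.commit()
    row_id = random_row(session).id
    print(empty_result, row_id)
```

This issues two queries per pick, so it is only database independent, not necessarily fast; caching the count, as the answer suggests, removes one of them.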

Other recommended answer

Here are four different variations, ordered from slowest to fastest, with timeit results at the bottom:

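import random

from sqlalchemy.sql import func
from sqlalchemy.orm import load_only

# model_name and db are assumed to come from a Flask-SQLAlchemy application

def simple_random():
    # Load every full row into Python, then pick one
    return random.choice(model_name.query.all())

def load_only_random():
    # Same idea, but fetch only the id column
    return random.choice(model_name.query.options(load_only('id')).all())

def order_by_random():
    # Let the database shuffle the rows and return the first one
    return model_name.query.order_by(func.random()).first()

def optimized_random():
    # Skip a random number of rows; note that .all() returns a one-element list
    return model_name.query.options(load_only('id')).offset(
            func.floor(
                func.random() *
                db.session.query(func.count(model_name.id))
            )
        ).limit(1).all()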

timeit results for 10,000 runs on my MacBook against a PostgreSQL table with 300 rows:

simple_random(): 
    90.09954111799925
load_only_random():
    65.94714171699889
order_by_random():
    23.17819356000109
optimized_random():
    19.87806927999918

You can easily see that using func.random() is far faster than returning all results to Python's random.choice().

Additionally, as the size of the table increases, the performance of order_by_random() will degrade significantly, because an ORDER BY requires a full table scan, whereas the COUNT in optimized_random() can use an index.
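
The figures above come from the author's machine and a PostgreSQL table. To compare approaches on your own data, a small timeit harness along these lines may help. This is only a sketch: it assumes SQLAlchemy 1.4 or later, uses an in-memory SQLite database rather than PostgreSQL, and the Sample model is a hypothetical stand-in for model_name, so the numbers it prints will not match the ones quoted here.

```python
import random
import timeit

from sqlalchemy import Column, Integer, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Sample(Base):  # hypothetical model standing in for model_name
    __tablename__ = "sample"
    id = Column(Integer, primary_key=True)
    value = Column(Integer)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)
session = Session(engine)
session.add_all([Sample(value=i) for i in range(300)])
session.commit()


def simple_random():
    # Load every row into Python, then pick one
    return random.choice(session.execute(select(Sample)).scalars().all())


def order_by_random():
    # Let the database shuffle the rows and return a single one
    stmt = select(Sample).order_by(func.random()).limit(1)
    return session.execute(stmt).scalars().first()


# Total seconds for 50 calls of each function
timings = {
    fn.__name__: timeit.timeit(fn, number=50)
    for fn in (simple_random, order_by_random)
}
for name, seconds in timings.items():
    print(f"{name}(): {seconds:.4f}")
```

Raising number and the row count moves the test closer to the conditions described above.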