
Schemaless

 

In the NoSQL world it is common to talk about schemaless databases or data models.

It would be more precise to say “dynamic schema”.  MongoDB still has structure: databases, a system catalog of collections, documents within collections, and explicitly declared indexes on a collection.  The big difference is that “columns”, or rather fields in the document data model, are not predeclared.  Each field/value pair in a document is dynamic and can be present or missing.  Each value also has a datatype, so the model isn’t typeless; it is dynamically typed, or what some might call duck typing.

Here’s an example in the mongo shell.  We may have a couple docs:

> db.persons.find()
{ "name" : "jane", "age" : 25 }
{ "name" : "ben", "age" : 30 }

We could then add a new person with an extra attribute:

> db.persons.insert({name:'julie',age:28,likes:'baseball'})
> db.persons.find()
{ "name" : "jane", "age" : 25 }
{ "name" : "ben", "age" : 30 }
{ "name" : "julie", "age" : 28, "likes" : "baseball" }

No “alter table” necessary.  This is very helpful with agile development methodologies. 

We can take it a step further, however.  The type of a field’s value need not be consistent from document to document.  In practice, it is very common for the contents of a collection to be homogeneous, but we have the option.  For example, suppose we want to add “likes” for ben, but ben likes a couple of things.  What to do?

> db.persons.update({name:'ben'},{$set:{likes:['math','baseball']}})
> db.persons.find()
{ "name" : "jane", "age" : 25 }
{ "name" : "julie", "age" : 28, "likes" : "baseball" }
{ "name" : "ben", "age" : 30, "likes" : [ "math", "baseball" ] }

In this example, things work out particularly elegantly: even though one likes value is an array and the other a string, we can still run interesting queries across both.  This is because when querying for a value, if the stored value is an array, MongoDB looks inside the array:

> db.persons.find({likes:'baseball'})
{ "name" : "julie", "age" : 28, "likes" : "baseball" }
{ "name" : "ben", "age" : 30, "likes" : [ "math", "baseball" ] }

Likewise we can index the field:

> db.persons.createIndex( { likes : 1 } )
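
One way to picture an index over a field that is sometimes a string and sometimes an array is as a map with one entry per scalar value and one entry per array element. The following is a toy model under that assumption, with a made-up `buildIndex` helper; it is not MongoDB's actual index structure, and as a simplification it skips documents that lack the field.

```javascript
// Toy model of an index on a field holding either a scalar or an array.
// One index entry per scalar value, one per array element.
const indexedPersons = [
  { name: 'jane', age: 25 },
  { name: 'julie', age: 28, likes: 'baseball' },
  { name: 'ben', age: 30, likes: ['math', 'baseball'] },
];

// Hypothetical helper: map each indexed value to the names of the
// documents that contain it.
function buildIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    if (!(field in doc)) continue; // simplification: skip docs without the field
    const keys = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    for (const key of keys) {
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(doc.name);
    }
  }
  return index;
}

const likesIndex = buildIndex(indexedPersons, 'likes');
console.log(likesIndex.get('baseball')); // [ 'julie', 'ben' ]
```

A lookup for a single value then finds both kinds of document through the same key.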

All very handy and useful.  But you might ask “won’t my data get rather dirty with no schema constraints?”  I had this concern when we started; I assumed we would just add some constraint rules later when needed.  Oddly, there hasn’t been a lot of demand for the feature, so far.  Empirically, it seems the data doesn’t get too noisy.
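
If you do want to check how noisy a collection has become, a small application-side survey is one option. The sketch below assumes the documents have already been loaded into an array; `surveyFieldTypes` is a made-up helper, and it reports JavaScript types rather than BSON types.

```javascript
// Hypothetical audit helper: for each field, count how often each
// JavaScript type occurs across a set of documents.
function surveyFieldTypes(docs) {
  const survey = {};
  for (const doc of docs) {
    for (const [field, value] of Object.entries(doc)) {
      const type = Array.isArray(value) ? 'array' : typeof value;
      survey[field] = survey[field] || {};
      survey[field][type] = (survey[field][type] || 0) + 1;
    }
  }
  return survey;
}

const sampleDocs = [
  { name: 'jane', age: 25 },
  { name: 'julie', age: 28, likes: 'baseball' },
  { name: 'ben', age: 30, likes: ['math', 'baseball'] },
];

const report = surveyFieldTypes(sampleDocs);
console.log(report.likes); // { string: 1, array: 1 }
```

A field whose counts are spread across several types, or that appears in only a handful of documents, is a candidate for a closer look.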

One other very important note: the dynamic schema is not just for developer friendliness!  There is another good reason for it.  Imagine changing the schema in a database cluster involving 2,000 servers.  It might be tricky to change that global state across every server in a consistent manner.  One goal here is to store very big data sets.  An “alter table” is probably not going to fly with billions or trillions of documents.

P.S. For compactness, the examples above do not show the _id field that MongoDB or its driver automatically adds to all documents.

P.P.S. Dynamic schema is not unique to MongoDB; some other products in the space do it too.  Of course I’m biased: this is my favorite.
