发布新帖

検索

公告
· 九月 16, 2022

InterSystems extension for Docker Desktop is now published

Great news, the extension to Docker Desktop for InterSystems Container Registry is now publicly available for everyone.

It is already available in the marketplace in Docker Desktop. It was published today, and it requires restarting Docker Desktop to see it.

Feel free to post any feedback in the GitHub repository, here

讨论 (0)1
登录或注册以继续
文章
· 九月 13, 2022 阅读大约需 8 分钟

CI/CD with IRIS SQL

In the vast and varied SQL database market, InterSystems IRIS stands out as a platform that goes way beyond just SQL, offering a seamless multimodel experience and supporting a rich set of development paradigms. Especially the advanced Object-Relational engine has helped organizations use the best-fit development approach for each facet of their data-intensive workloads, for example ingesting data through Objects and simultaneously querying it through SQL. Persistent Classes correspond to SQL tables, their properties to table columns and business logic is easily accessed using User-Defined Functions or Stored Procedures. In this article, we'll zoom in on a little bit of the magic just below the surface, and discuss how it may affect your development and deployment practices. This is an area of the product where we have plans to evolve and improve, so please don't hesitate to share your views and experiences using the comments section below.

⚠ This part of the product is changing significantly with IRIS 2025.2, please see this article for more detail. The information below applies to 2025.1 and earlier releases.

 

Saving the Storage Definition

Writing brand new business logic is easy, and assuming you have well-defined APIs and specifications, adapting or extending it usually is too. But when it's not just business logic, but also involves persistent data, anything you change from the initial version will need to be able to cope with data that was ingested through that earlier version.

On InterSystems IRIS, data and code coexist in a single high-performance engine, without the half dozen abstraction layers you might see in other 3GL or 4GL programming frameworks. This means there's only a very thin and transparent mapping to translate your class' properties to $list positions in a global node per row of data when using default storage. If you add or remove properties, you don't want the data from a removed property to show up under a new property. This mapping of your class' properties is what the Storage Definition takes care of, a somewhat cryptic XML block you may have noticed at the bottom of your class definition. The first time you compile a class, a new Storage Definition gets generated based on the class' properties and parameters. When you make changes to your class definition, at recompile time those changes are reconciled with the existing Storage Definition and it gets amended such that it remains compatible with existing data. So while you are going out on a limb refactoring your classes, the Storage Definition carefully considers your earlier creativity and ensures both old and new data remain accessible. We call this schema evolution.

In most other SQL databases the physical storage of your tables is much more opaque if visible at all, and changes can only be made through ALTER TABLE  statements. Those are standard DDL (Data Definition Language) commands, but typically much less expressive than what you can achieve modifying a class definition and procedural code on IRIS directly. 

At InterSystems, we strive to offer IRIS developers the ability to cleanly separate their code and data, as that is crucial to ensure smooth packaging and deployment of your applications. The Storage Definition plays a unique role in this, as it captures how the one maps to the other. That's why it's worth a closer look in the context of general development practices and CI/CD pipelines in particular.

Exporting to UDL

In the current century, source code management is file-based, so let's first take a look at IRIS' main file export format. The Universal Description Language is, as its name suggests, meant to be a universal file format for any and all code you write on InterSystems IRIS. It is the default export format when working with the VS Code ObjectScript plug-in and leads to easy-to-read files that resemble almost exactly what you'd see in an IDE, with an individual .cls file for each class (table) in your application. You can use $SYSTEM.OBJ.Export() to create UDL files explicitly, or just leverage the VS Code integration.

From the days of Studio, you may also remember an XML format that captured the same information as UDL and allowed grouping multiple classes into a single export. While that last part is convenient in some scenarios, it's a lot less practical to read and track differences across versions, so we'll ignore it for now.

Because UDL is meant to capture everything IRIS can express about a class, it will include all elements of a class definition, including the full Storage Definition. When importing a class definition that already includes a Storage Definition, IRIS will verify whether that Storage Definition covers all properties and indices of the class and if that is the case, just take it as-is and overwrite the previous Storage Definition for that class. This makes UDL a practical format for version management of classes and their Storage Definition, as it preserves that backwards compatibility for data ingested through prior versions of the class, wherever you deploy it to. 

If you are a hardcore developer, you may wonder whether these Storage Definitions keep growing and whether this "baggage" needs to be carried around indefinitely. The purpose of Storage Definitions is to preserve compatibility with pre-existing data, so if you know there isn't any of that and you want to get rid of lengthy genealogy, you can "reset" your Storage Definition by removing it from the class definition and have the class compiler generate it anew. For example, you may use this to take advantage of newish best practices such as the use of Extent Sets, which implement hashed global names and separate each index into their own global, improving low-level efficiencies. For backwards compatibility within customer applications, we cannot universally change such defaults in the %Persistent superclass (though we will apply them when creating a table from scratch using the CREATE TABLE  DDL command), so a periodic review of your classes and their storage is worthwhile. It is also possible to edit the Storage Definition XML directly, but users should exercise extreme caution as this may render existing data inaccessible.

So far so good. Storage Definitions offer a smart mapping between your classes and automatically adapt as the schema evolves. What else is in there?

Static vs Stats?

As you probably know, the InterSystems IRIS SQL engine makes advanced use of table statistics to identify the optimal query plan for any given statement the user executes. Table statistics include metrics on the size of a table, how values are distributed within a column, and much more. This information helps the IRIS SQL optimizer to decide which index is most beneficial, in what order to join tables, etcetera, so intuitively the more up-to-date your statistics are, the better chances you have to optimal query plans. Unfortunately, until we introduced fast block sampling in IRIS 2021.2, collecting accurate table statistics used to be a computationally expensive operation. Therefore, when customers deploy the same application to many environments in which the data patterns were largely the same, it made sense to consider the table statistics part of the application code and include them with the table definitions.

This is why on IRIS today you'll find the table statistics embedded inside the Storage Definition. When collecting table statistics through a manual call to TUNE TABLE or implicitly by querying it (see below), the new statistics are written to the Storage Definition and existing query plans for this table are invalidated so they can take advantage of the new statistics upon the next execution. Because they are part of the Storage Definition, these statistics will be part of UDL class exports and therefore may end up in your source code repository. In the case of carefully vetted statistics for a packaged application, this is desirable, as you'll want these specific statistics to drive query plan generation for all the application deployments.

Starting with 2021.2, IRIS will automatically collect table statistics at the start of query planning when querying a table that doesn't have any statistics at all and is eligible for fast block sampling. In our testing, the benefits of working with up-to-date statistics rather than no statistics at all clearly outweighed the cost of on-the-fly statistics gathering. For some customers however, this has had the unfortunate side effect that statistics gathered automatically on the developer's instance end up in the Storage Definition in the source control system and eventually in the packaged application. Obviously, the data in that developer environment and therefore the statistics on it may not be representative for a real customer deployment and lead to suboptimal query plans.

This scenario is easy to avoid. Table statistics can be excluded from the class definition export by using the /exportselectivity=0 qualifier when calling $SYSTEM.OBJ.Export() , or use the /importselectivity flag value of your choice to avoid importing them along with the code. The system default for these flag can be found here, and changed using $SYSTEM.OBJ.SetQualifiers(...). You can then leave it up to the automatic collection in the eventual deployment to pick up representative stats, make explicit statistics collection part of your deployment process, which will overwrite anything that might have been packaged with the application, or manage your table stats separately through their own import/export functions: $SYSTEM.SQL.Stats.Table.Export() and Import().  

We're currently working on a project to move table statistics to live with the data rather than be part of the code, and differentiate more cleanly between any fixed statistics configured explicitly by a developer and those collected from the actual data. Also, we're planning more automation with respect to periodically refreshing those statistics, based on how much the table data changes over time. 

Wrapping up

In this article, we've outlined the role of a Storage Definition in IRIS ObjectRelational engine, how it supports schema evolution and what it means to include it in your source control system. We've also described why table statistics are currently stored in that Storage Definition and suggested development practices for making sure your application deployments end up with statistics that are representative for the actual customer data. As mentioned earlier, we're planning to further enhance these capabilities so we look forward to your feedback on the current and planned functionality and refine our design as appropriate.

6 Comments
讨论 (6)4
登录或注册以继续
问题
· 九月 11, 2022

BuildKit does not work in Online demo

Hello, Community, I tried to deploy online demo for interoperability-test app. I see it errored because BuildKit is not available. Can the Docker be configured to support BuildKit?

3 Comments
讨论 (3)3
登录或注册以继续
文章
· 九月 11, 2022 阅读大约需 16 分钟

Sustainable Machine Learning for the InterSystems Interoperability Contest

Hello everyone, I’m a French student that just arrived in Prague for an academical exchange for my fifth year of engineering school and here is my participation in the interop contest.

I hadn’t much time to code since I was moving from France to Prague and I’m participating alone, so I decided to make a project that’s more like a template rather than an application.

4 Comments
讨论 (4)2
登录或注册以继续
文章
· 八月 30, 2022 阅读大约需 3 分钟

使用Python DB-API连接InterSystems IRIS

InterSystems IRIS 允许从任何符合DB-API的Python应用程序对InterSystems IRIS 进行快速、无缝地访问。Python DB-API驱动是对PEP 249 v2.0(Python数据库API规范 v2.0)的完整兼容。

 

步骤

  1. 前提是要有一个Python的开发环境。 本示例使用vs code 如下所示创建一个dbapi.py文件。 dbapi.py :
    # Embedded Python examples from summer 2022
    import iris as dbapi
    ​
    mytable = "mypydbapi.test_things"
    conn = dbapi.connect(hostname='localhost', port=1972, namespace='IRISAPP', username='superuser', password='iris')
    ​
    # Create table
    cursor = conn.cursor()
    try:
     cursor.execute(f"CREATE TABLE {mytable} (myvarchar VARCHAR(255), myint INTEGER, myfloat FLOAT)")
    except Exception as inst:
     pass
    cursor.close()
    conn.commit()
    ​
    # Create some data to fill in
    chunks = []
    paramSequence = []
    for row in range(10):
     paramSequence.append(["This is a non-selective string every row is the same data", row%10, row * 4.57292])
     if (row>0 and ((row % 10) == 0)):
       chunks.append(paramSequence)
       paramSequence = []
    chunks.append(paramSequence)
    ​
    query = f"INSERT INTO {mytable} (myvarchar, myint, myfloat) VALUES (?, ?, ?)"
    ​
    for chunk in chunks:
     cursor = conn.cursor()
     cursor.executemany(query, chunk)
     cursor.close()
     conn.commit()
    # conn.close()
    ​
    sql = f"select * from {mytable}"
    rowsRead = 0
    cursor = conn.cursor()
    cursor.arraysize = 20
    ​
    cursor.execute(sql)
    rc = cursor.rowcount
    rows = cursor.fetchall()
    for row in rows:
     print(row)
    rowsRead += len(rows)
    ​
    cursor.close()
    conn.close()
  2. 安装 DB-API驱动,点击此链接下载DB-API驱动
    pip install intersystems_irispython-version-py3-none-any.whl
  3. 配置Connection String - 按照Intersystems IRIS的服务器,在dbapi.py文件中配置
    • hostname
    • port
    • namespace
    • username
    • password
  1. 在InterSystems IRIS管理门户中创建IRISAPP命名空间。
  2. 在VS code中运行dbapi.py文件,运行结果如下,说明数据成功导入。
    ['This is a non-selective string every row is the same data', 0, 0.0]
    ['This is a non-selective string every row is the same data', 1, 4.57292]
    ['This is a non-selective string every row is the same data', 2, 9.14584]
    ['This is a non-selective string every row is the same data', 3, 13.71876]
    ['This is a non-selective string every row is the same data', 4, 18.29168]
    ['This is a non-selective string every row is the same data', 5, 22.8646]
    ['This is a non-selective string every row is the same data', 6, 27.43752]
    ['This is a non-selective string every row is the same data', 7, 32.01044]
    ['This is a non-selective string every row is the same data', 8, 36.58336]
    ['This is a non-selective string every row is the same data', 9, 41.156279999999995]
  3. 在IRISAPP命名空间下,查看InterSystems IRIS 数据库,可以看到数据,说明数据导入成功InterSystems IRIS。
    SELECT myvarchar, myint, myfloat FROM mypydbapi.test_things
1 Comment
讨论 (1)2
登录或注册以继续