
Article · January 30, 2023 · about 1 minute to read

JSONfile-to-Global #2

The power of the presented tool becomes visible with a large JSON file.
The example I have prepared is composed of the daily results of AoC 2023.
Anonymization makes it usable for demo purposes.
Like the original, it is an uninterrupted sequence of bytes without any newline or similar break:
6,660,895 bytes raw in sequence (as reported by NTFS). With this content (shrunk):

It is loaded into a %DynamicObject using the method %FromJSONfile.
If you try to investigate that object using ZWRITE, you end up with a <MAXSTRING> error.
And this is perfectly correct: the maximum string size is exceeded.
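A minimal Terminal sketch of that situation (the file path is illustrative):

USER>set jsn={}.%FromJSONfile("/data/aoc_results.json")

USER>zwrite jsn
<MAXSTRING>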

If you iterate down to the embedded JSON objects, this conflict is gone, as they are only ~30 kB each.
The magic is performed by %Stream.DynamicCharacter, which acts in the background.
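If you need one of those oversized values directly, %Get can return it as a stream instead of a string; a short sketch ("members" is an assumed key name):

USER>set big=jsn.%Get("members",,"stream")  ; returns a %Stream.DynamicCharacter

USER>write big.Size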

The resulting Global is as huge as its source,
but now you can navigate and examine the individual elements.
A snapshot from SMP of the resulting Global's top subscripts:

and in total:

I had to run this from Terminal, as SMP failed after a few ten thousand lines, and used this single line instead:

s g="^json" f i=1:1 s g=$q(@g,1,v) q:g=""  w i,":",?8,g,?78,"= ",v,!
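The same loop, expanded and commented for readability:

    set g="^json"                      ; start at the unsubscripted global
    for i=1:1 {
        set g=$query(@g,1,v)           ; next global reference, value into v
        quit:g=""                      ; past the last node
        write i,":",?8,g,?78,"= ",v,!  ; counter, reference, value
    }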

Video

GitHub

Article · January 30, 2023 · about 2 minutes to read

JSONfile-to-Global #1

The similarity between JSON objects + arrays and Globals in IRIS or Caché is evident.
With small and medium-sized JSON objects, navigation across %DynamicObjects is comfortable.
But with large and/or deeply cascaded objects it becomes a challenge.

The presented tool offers 3 variants:

  • loading an already existing %DynamicObject or %DynamicArray into a global of your choice
  • loading a %Stream containing a JSON object into a global of your choice
  • loading an external file containing a JSON object into a global of your choice

How to use it

USER>read str
{"id":306904,"last_star_ts":0,"completion_day_level":{},"global_score":0,"local_score":0,"stars":0,"name":"name_1"}
USER>set jsn={}.%FromJSON(str)
USER>write ##class(rcc.jstog).toglobal(jsn,"^jsn")
1
USER>zwrite ^jsn
^jsn("global_score")=0
^jsn("id")=306904
^jsn("last_star_ts")=0
^jsn("local_score")=0
^jsn("name")="name_1"
^jsn("stars")=0

USER>zzjson jsn
{
  "id":306904,
  "last_star_ts":0,
  "completion_day_level":{
  },
  "global_score":0,
  "local_score":0,
  "stars":0,
  "name":"name_1"
}
USER>

From an already existing stream, it's like this:


USER>write ##class(rcc.jstog).stream(jsonstream,"^jsstr")
1

And from a file, it is this method:


USER>set filename="/opt/irisbuild/src/data/demo.json"
USER>write ##class(rcc.jstog).file(filename)  ; using default global ^json
1
USER>

File content:

Global content:

This might not look so spectacular.
But see and try the difference with a 6 MB JSON object.

Video

GitHub

Article · January 27, 2023 · about 1 minute to read

Global-Streams-to-SQL #3

Seeing results in SMP or Terminal

If you run the suggested queries from Terminal in the SQL Shell, you see:



This is not so thrilling, as the stream OID doesn't show the content.
So I have added a dump function to show the stream: rcc.dumps(<column name>).
Of course, dumping binaries doesn't make much sense in Terminal, so CHR and CZIP are the typical cases.
See:
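The call is a plain wrapper around the mapped column, along these lines (chr and czip as the column names):

select id, rcc.dumps(chr), rcc.dumps(czip) from rcc.gstream where rcc.use('^txtS')=1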


And it works even in SMP { IRIS for Windows (x86-64) 2022.3 (Build 589U) }, processing in the foreground:



Online demo: the SuperServer port for external ODBC / JDBC is 60001.
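An external JDBC connection would therefore use a URL of this shape (host placeholder and USER namespace are assumptions):

jdbc:IRIS://<demo-host>:60001/USER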

I hope for your votes in the contest.

Article · January 23, 2023 · about 2 minutes to read

Global-Streams-to-SQL #2

Some technical background information

There is not just one class in this package, rcc.gstream.cls, but also rcc.gstreamT.cls.

While rcc.gstream works with direct access to the stream globals, the *T version uses
a Process Private Global (PPG) as temporary storage,
using SELECT * FROM RCC.gstreamT WHERE RCC.useT('^jpgS')=1 and similar.

This might be an advantage for multiple accesses to the same stream in sequence.
The advantage is obvious: you have a personal snapshot in memory and no risk of
a conflict in access. This might be interesting if you work on just 1 specific stream.
The disadvantage is also evident: the merge from the stream global to the PPG takes time.
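A conceptual sketch of what the *T variant does behind the scenes (the PPG name is illustrative):

    ; take a one-time private snapshot of the stream global;
    ; ^||... globals are visible only to this process and vanish when it ends
    merge ^||gstream = ^jpgS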
As a side effect, I learned that the compression of the stream variants GblChrCompress
and GblBinCompress only affects the global size on disk. Once in memory, a stream consumes
its full size, and therefore with xDBC the full uncompressed stream gets sent over the
connection.
Inside IRIS/Caché/Ensemble, by contrast, you only get to the stream's OID, which makes
sense as streams typically exceed the maximum string length. To visualize this, I did
something unusual and mixed flat and compressed data in the same global.
So all even IDs are compressed streams.
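A sketch of how such a mixed global can be produced: both stream classes accept a %Location pointing at the same global (names are illustrative):

    set flat=##class(%Stream.GlobalCharacter).%New()
    set flat.%Location="^txtS"            ; plain chunks in ^txtS
    do flat.Write("stored as raw text")
    do flat.%Save()

    set zip=##class(%Stream.GblChrCompress).%New()
    set zip.%Location="^txtS"             ; same global, compressed on disk
    do zip.Write("stored compressed")
    do zip.%Save()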

The code in the repo is written for IRIS.
But using it in Caché/Ensemble/... only requires changing the storage type from
<Type>%Storage.SQL</Type> in IRIS to <Type>%CacheSQLStorage</Type> for C/E.

The base for both classes is Global mapping with SQL Storage,
and it was quite an exercise to make it dynamic.
My personal thanks go to Mike LaRocca, who created a presentation about 20 years ago
that I found in my personal archives, together with a similarly aged related document,
"Mapeo de Clases persistentes en Globales", by a Spanish engineer
whose name was not mentioned there.
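For orientation, the skeleton of such a mapped storage definition looks roughly like this (a simplified sketch with illustrative names, not the actual storage of rcc.gstream):

Storage SQLStorage
{
<SQLMap name="DataMap">
<Data name="body">
<RetrievalCode> set {*}=$get(^txtS({L1},1))</RetrievalCode>
</Data>
<Global>^txtS</Global>
<Subscript name="1">
<Expression>{id}</Expression>
</Subscript>
<Type>data</Type>
</SQLMap>
<Type>%Storage.SQL</Type>
}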

As a bonus for those who try the package, there is a medical JPEG not shown in the video.
I guess you will recognize the person immediately.




Hoping for your votes in the tools contest!

Video

GitHub

Article · January 23, 2023 · about 2 minutes to read

Global-Streams-to-SQL

In general, Global Streams are data objects embedded in classes / tables.
Using and viewing them with SQL is normally part of the access to the containing tables.

SO WHAT?

During debugging, or when searching for strange or unexpected behavior, there could be the need to
get closer to the stored stream. That is no big problem with direct access to Globals in SMP or Terminal,
but with SQL you are lost.
So my tool provides dynamic access to Global Streams wherever you may need it.
Special thanks to @Oliver Wilms for the inspiration for this tool.

Mapping globals to SQL is a rather traditional art from the past,
though Global Streams are somewhat specialized.
But even as their object classes have changed, their representation in the Global is the same.

The tricky point is that we see 4 different types of Global Streams in a common structure.
Only the embedding class / table knows the meaning of the content:

  • %Stream.GlobalCharacter   -  raw text
  • %Stream.GblChrCompress - zipped text
  • %Stream.GlobalBinary - raw binary sequence
  • %Stream.GblBinCompress - zipped binary sequence

The Global itself has no indication of what format it holds.
Dumping the Global only helps for raw text; the rest needs special treatment.
In combination with SQL, you meet the problem of maximum field lengths.

I cover this issue by mapping all 4 types over the same stream, and the user decides.
In addition, the total size and number of subnodes are also available.
For string manipulation, the first subscript level is also available as a "body" VARCHAR.
The Global Stream to examine is provided by a static WHERE clause like this:

select * from rcc.gstream where rcc.use('^txtS')=1
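And since the first subscript level is mapped as the "body" VARCHAR, ordinary SQL string functions apply to it, for example:

select id, SUBSTRING(body,1,40) from rcc.gstream where rcc.use('^txtS')=1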


WinSQL is friendly enough to let you see the full content, e.g. for id=1, chr:

and czip:

With SQL, the compressed data can be viewed unzipped.

Video

GitHub
