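How do I parse an HTML file in M, and how do I set up the template passed as the second parameter of the class shown in the screenshot?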

Question

liu bo · April 9, 2024

Product version: Ensemble 2016.1

Discussion (2)

score 0 · Answer 1 · 2024-04-22T22:58:17-04:00

The pTemplate parameter is the path to a template file (*.tmpl). For example:
Do ##class(Ens.Util.HTML.Parser).testFile("http://localhost:57772/csp/samples/menu.csp","C:\test\MenuFind.tmpl",.tOut)

An example MenuFind.tmpl file:

<b>[sessionevents.csp]</b></A></td><td>{sesstimeoutcomment}
<b>[zipcode.csp]</b>[Demo of using ][ to process client events on the application server.]<A,HREF={zipcodehref}>
=<b>[xmlclasses.csp]</b>[Demo of displaying class instances as XML.]<A,HREF={xmlclasseshref}>
<a,href=http://www.intersystems.com><img,src={cspiscgifname}>
=[CSP Samples Directory][Display class instances as XML]+<tr><A><b>{pageincspdirectory}</b>+
[CSP Samples Directory][Display class instances as XML]<tr><A><b>{pageincspdirectoryagain}
[Text that cannot be matched]{variablethatdoesnotexist}
~<title>{pagetitle}

This example produces the following result:

tOut("cspiscgifname")="created-with-csp.gif"
tOut("pageincspdirectory",1)="xmlimport.csp"
tOut("pageincspdirectory",2)="xmlquery.csp"
tOut("pageincspdirectory",3)="zipcode.csp"
tOut("pageincspdirectoryagain")="xmlimport.csp"
tOut("pagetitle")="CSP Samples Menu"
tOut("sesstimeoutcomment")="Example of how to use the session timeout event."
tOut("xmlclasseshref")="showsource.csp?PAGE=/csp/samples/xmlclasses.csp"
tOut("zipcodehref")="showsource.csp?PAGE=/csp/samples/zipcode.csp"
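
The result array mixes two shapes: single captures stored directly at tOut(name), and repeated captures stored at tOut(name, n). A minimal sketch of walking both with $ORDER follows. The class name Demo.ParserDump and the method name Dump are hypothetical and not part of Ensemble.

```objectscript
/// Hypothetical helper class; the name Demo.ParserDump is an assumption.
Class Demo.ParserDump Extends %RegisteredObject
{

/// Write every captured value in a parser output array to the current device.
ClassMethod Dump(ByRef pOut)
{
    set name = ""
    for {
        set name = $order(pOut(name))
        quit:name=""
        // Single capture: the value sits directly at pOut(name)
        if $data(pOut(name))#2 {
            write name, " = ", pOut(name), !
        }
        // Repeated capture: values sit at pOut(name, 1), pOut(name, 2), ...
        set idx = ""
        for {
            set idx = $order(pOut(name, idx), 1, value)
            quit:idx=""
            write name, "(", idx, ") = ", value, !
        }
    }
}

}
```

After the testFile call, it could be invoked as Do ##class(Demo.ParserDump).Dump(.tOut). For a quick look in the terminal, ZWRITE tOut shows the same data without any helper.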

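The *.tmpl syntax is as follows:

[] Plain text.
<> An HTML tag.
<,=> Matches a <> tag that has the given attributes.
{} An item in curly braces matches literal text and stores it under the subscript named inside the braces.
= Searches for the template content that follows, starting from the beginning of the HTML input.
~ Searches for the template content that follows, starting from the beginning of the HTML input, but only if the HTML stream has reached its end.
++ Items that appear between + signs are matched repeatedly.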
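
To connect these rules to the results shown earlier, here are two lines taken unchanged from MenuFind.tmpl. The explanation under them is one reading based on the rules and the printed results, not on the parser's source code.

```text
~<title>{pagetitle}
<a,href=http://www.intersystems.com><img,src={cspiscgifname}>
```

In the first line, ~ controls where the search starts, <title> matches the tag, and {pagetitle} captures the text after it, which appears in the results as tOut("pagetitle")="CSP Samples Menu". In the second line, <a,href=...> matches an anchor tag with that exact href, and <img,src={cspiscgifname}> captures the src attribute of the image that follows, which appears as tOut("cspiscgifname")="created-with-csp.gif".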

score 1 · Answer 2 · 2024-04-26T04:44:43-04:00

liu bo · April 26, 2024

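Solved. I stopped extracting content with the template and simply converted the HTML to plain text:

set htmlSnippet = "<h1>Hello</h1>"
// Remove every tag: "<", any run of characters other than ">", then ">"
set regex = ##class(%Regex.Matcher).%New("<[^>]*>", htmlSnippet)
set htmlSnippet = regex.ReplaceAll("")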
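
The same approach can be wrapped in a reusable method. The following is a sketch only: the class name Demo.HtmlText and the method name StripTags are hypothetical, and the entity-decoding step with $ZCONVERT is an addition that the original answer does not include.

```objectscript
/// Hypothetical helper class; the name Demo.HtmlText is an assumption.
Class Demo.HtmlText Extends %RegisteredObject
{

/// Remove anything that looks like an HTML tag and return the remaining text.
ClassMethod StripTags(pHtml As %String) As %String
{
    // Same pattern as in the answer above: "<", any run of non-">" characters, ">"
    set matcher = ##class(%Regex.Matcher).%New("<[^>]*>", pHtml)
    set text = matcher.ReplaceAll("")
    // Addition to the original answer: decode entities such as &amp; and &lt;
    quit $zconvert(text, "I", "HTML")
}

}
```

For example, Write ##class(Demo.HtmlText).StripTags("<h1>Hello</h1>") prints Hello. A regular expression like this is not an HTML parser: the contents of script and style elements stay in the output, and a ">" inside an attribute value or a comment ends the match early. It probably works best on simple, well-formed fragments.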
