Convert a string into an array of objects

Question

I currently have this string:

"Hello, I'm <%first%> <%last%>, and I <3 being a <%occupation%>. I am > blah";

How can I split it into an array of objects of the following form?

[
  { type: 'string', value: "Hello, I'm " },
  { type: 'token', value: 'first' },
  { type: 'string', value: ' ' },
  { type: 'token', value: 'last' },
  { type: 'string', value: ', and I <3 being a ' },
  { type: 'token', value: 'occupation' },
  { type: 'string', value: '. I am > blah' }
]

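The pattern is: whenever we find a word that looks like <%word%>, we put it into the array as an object of type token. Everything else goes into the array as an object of type string. My question is how to implement this in code.

I am having trouble building up the phrase that becomes the value in each key-value pair. The code I tried is flawed. My idea was to keep an empty word variable and, while iterating over the string, concatenate characters onto it to form the phrase. Once it sees that it is between <% and %>, the phrase is given the type token and pushed into the arrStr array.

However, there are three problems: 1) on every iteration the whole string so far is generated again, going from "H" to "He" to "Hel" to "Hell" to "Hello"; 2) it never seems to reach the token branch; 3) the first "h" of "hello" is dropped.

How can I do this without using regular expressions?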

- lolcatapril24

7 Answers

- Bulent · Answer 1

I'm sure a regular expression would be the better solution, but I don't know regex. So here is my approach:

let txt = "Hello, I'm <%first%> <%last%>, and I <3 being a <%occupation%>. I am > blah";

var arr = [];

txt.split("<%").forEach(x => {
  if (x.includes("%>")) {
    x.split("%>").forEach((y, i) => {
      arr.push({type: i == 0 ? 'token' : 'string', value: y});
    })
  } else {
    arr.push({type: 'string', value: x});
  }
});

console.log(arr);
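
The same no-regex idea can also be written as a single left-to-right scan with indexOf, which is closer to the loop the question describes. This is only a sketch: tokenize and its open/close parameters are hypothetical names, not part of the answer above, and it assumes an opening <% without a matching %> should be kept as plain text.

```javascript
// Hypothetical helper: scans left to right with indexOf instead of splitting.
function tokenize(text, open = "<%", close = "%>") {
  const out = [];
  let pos = 0;
  while (pos < text.length) {
    const start = text.indexOf(open, pos);
    // A token needs both an opening and a closing delimiter.
    const end = start === -1 ? -1 : text.indexOf(close, start + open.length);
    if (end === -1) break; // no complete token left
    if (start > pos) {
      // Literal text between the previous token and this one.
      out.push({ type: "string", value: text.slice(pos, start) });
    }
    out.push({ type: "token", value: text.slice(start + open.length, end) });
    pos = end + close.length;
  }
  // Whatever remains after the last complete token is plain text.
  if (pos < text.length) {
    out.push({ type: "string", value: text.slice(pos) });
  }
  return out;
}

console.log(tokenize("Hello, I'm <%first%> <%last%>, and I <3 being a <%occupation%>. I am > blah"));
```

Unlike the split-based version, this variant does not push empty string entries when the input starts or ends with a token.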

- Guerric P · Answer 2

You can parse the string with a regular expression like this:

const input = "Hello, I'm <%first%> <%last%>, and I <3 being a <%occupation%>. I am > blah";

const regex = /<%([^%]+)%>/g;

let result;
let parsed = [];
let lastIndex = 0;
while(result = regex.exec(input)) {
  const { 1: match, index } = result;
  parsed.push({ type: 'string', value: input.substring(lastIndex, index) });
  parsed.push({ type: 'token', value: match });
  lastIndex = index + match.length + 4;
}
parsed.push({ type: 'string', value: input.substring(lastIndex) });


console.log(parsed);
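
A shorter variant of the same regex idea relies on the fact that String.prototype.split keeps captured groups in its result. The sketch below is an illustration rather than part of the answer above; parseWithSplit is a hypothetical name, and dropping empty literals is an assumption about the desired output.

```javascript
// Sketch only: parseWithSplit is a hypothetical name.
function parseWithSplit(input) {
  // With a capturing group, split() keeps the captured text in the result,
  // so literal text sits at even indexes and token names at odd indexes.
  return input
    .split(/<%([^%]+)%>/)
    .map((value, i) => ({ type: i % 2 === 1 ? "token" : "string", value }))
    // Drop empty literals, e.g. when the input starts or ends with a token.
    .filter(part => part.type === "token" || part.value !== "");
}

console.log(parseWithSplit("Hello, I'm <%first%> <%last%>, and I <3 being a <%occupation%>. I am > blah"));
```

The filter runs after the map so that the even/odd index check still lines up with the raw split result.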

- Majed Badawi · Answer 3

Using a regular expression:

const convertToObjectArr = (str = '') => {
  const reg = /<%(.*?)%>/g;
  const arr = [];
  let match, lastIndex = 0;

  while ((match = reg.exec(str)) !== null) {
    // Literal text between the previous token and this one.
    if (match.index > lastIndex) {
      arr.push({ type: 'string', value: str.substring(lastIndex, match.index) });
    }
    // match[1] is the captured name between <% and %>.
    arr.push({ type: 'token', value: match[1] });
    lastIndex = reg.lastIndex;
  }

  // Trailing literal text, or the whole input when no token was found.
  if (lastIndex === 0 || lastIndex < str.length) {
    arr.push({ type: 'string', value: str.substring(lastIndex) });
  }
  
  return arr;
}

console.log( convertToObjectArr("Hello, I'm <%first%> <%last%>, and I <3 being a <%occupation%>. I am > blah") );
console.log( convertToObjectArr("Hello") );
console.log( convertToObjectArr("Hello, I'm <%first%>") );

- Cody E · Answer 4

Here is my approach. Using a regular expression would simplify this considerably (this is pretty much the exact use case for regex).

const target = "Hello, I'm <%first%> <%last%>, and I <3 being a <%occupation%>. I am > blah"

const split = (el) => {
    const parts = el.split('%>')
    return [{type: 'token', value: parts[0]}, {type: 'string', value: parts[1]}]
}

const parts = target.split('<%')
              .map(el => 
                    el.indexOf('%>') === -1 
                    ? {type: "string", value: el} 
                    : split(el)
              ).flat()

console.log(parts)

- customcommander · Answer 5

You can tokenize the string with a lexer. In this example I used moo:

const lexer =
  moo.compile({ token: {match: /<%.+?%>/, value: raw => raw.slice(2, -2)}
              , strlt: /.+?(?=<%)/
              , strrt: /(?<=%>).+/ });

lexer.reset("Hello, I'm <%first%> <%last%>, and I <3 being a <%occupation%>. I am > blah");

console.log(

  [...lexer].map(({ type, value }) =>
    ({ type: (type === 'token' ? type : 'string')
     , value }))

);

<script src="https://unpkg.com/moo@0.5.1/moo.js"></script>

- Kinglish · Answer 6

var str = "Hello, I'm <%first%> <%last%>, and I <3 being a <%occupation%>. I am > blah";
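var final = [];
// Every piece after a "<%" starts with a token name, followed by "%>" and literal text.
str.split('<%').forEach(val => {
  var vala = val.split('%>');
  if (vala.length > 1) {
    final.push({type: 'token', value: vala[0]});
    // Re-join in case the literal text itself contains a stray "%>".
    final.push({type: 'string', value: vala.slice(1).join('%>')});
  } else {
    // No closing "%>", so the whole piece is literal text.
    final.push({type: 'string', value: val});
  }
});
console.log(final);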

- Som Shekhar Mukherjee · Answer 7

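Using two replaces

Let's use dummy strings to replace the angle-bracket delimiters so the data is easy to split.

So we replace every <% with *# and every %> with *.

Then we split the data on * alone. After splitting, we get the following array: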

[
  "Hello, I'm ",
  "#first",
  " ",
  "#last",
  ", and I <3 being a ",
  "#occupation",
  ". I am > blah",
]

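The extra # is added only for the opening delimiter, which lets us tell tokens apart from strings.

Now we can simply map over this array to get the desired result.

Note: you can use any suitable dummy string, as long as you make sure it does not occur in the input string.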

const 
  str = "Hello, I'm <%first%> <%last%>, and I <3 being a <%occupation%>. I am > blah",
      
  res = str
    .replace(/<%/g, "*#")
    .replace(/%>/g, "*")
    .split("*")
    .map((v) =>
      v[0] === "#"
        ? { type: "token", value: v.slice(1) }
        : { type: "string", value: v }
    );

console.log(res);
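
The note about choosing a dummy string matters in practice: with * and #, an input containing a literal * gets split in the wrong place, and literal text that begins with # right after a token is classified as a token. A minimal sketch of one way to guard against that, assuming the control characters U+0000 and U+0001 never appear in real input (SEP, MARK and splitWithDummy are hypothetical names):

```javascript
// Hypothetical helper: same two-replace idea, with reserved delimiters.
const SEP = "\u0000";  // assumed never to occur in the input
const MARK = "\u0001"; // marks a piece as a token

function splitWithDummy(str) {
  // Fail loudly instead of silently mis-splitting.
  if (str.includes(SEP) || str.includes(MARK)) {
    throw new Error("input contains a reserved delimiter character");
  }
  return str
    .replace(/<%/g, SEP + MARK)
    .replace(/%>/g, SEP)
    .split(SEP)
    .map((v) =>
      v[0] === MARK
        ? { type: "token", value: v.slice(1) }
        : { type: "string", value: v }
    );
}

console.log(splitWithDummy("2 * 3 = <%n%>#ok"));
```

Like the version above, this still emits empty string entries when a token sits at the very start or end of the input.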