How can I parse ASCII art into HTML using Java or JavaScript?

16

I saw that the Neo4j API uses ASCII art very cleverly:

http://jaxenter.com/getting-started-with-neo4j-the-java-graph-database-47955.html

I would like to try something similar, but converting ASCII art to HTML. How can I parse ASCII art so that, for example, given this ASCII art input:

--------------------------------
I                              I
I   -------          -------   I
I   I     I          I     I   I
I   I  A  I          I  B  I   I
I   I     I          I     I   I
I   -------          -------   I
I                              I
I                              I
--------------------------------

the resulting HTML output would be something like:

<div>
    <div style='display:inline;'>
             A
    </div>
    <div style='display:inline;'>
             B
    </div>
</div>

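Update

The question was closed on the grounds that I need to "demonstrate a minimal understanding of the problem". I do understand the problem being solved. What I want to solve is making templated HTML easier to understand in the source code of the following web framework:

https://github.com/zubairq/coils

However, the solution could apply to any web framework. I have seen someone attempt an initial version in C++ here:

https://github.com/h3nr1x/asciidivs2html/blob/master/asciidivs2html.cpp

Very impressive! If you can get it working in Java or Clojure, and we can get the question reopened, I will offer a bounty so that you can earn more points for the solution :)

I ran the Java solution provided by @meewok, and here is the result: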

$ java AsciiToDIVs.RunConverter
Created a box(ID=0,X=0,Y=0,width=33,height=10)
Created a box(ID=1,X=2,Y=4,width=8,height=5,parent=0)
Created a char(Char=A,X=4,Y=7,parent=1)
Created a box(ID=2,X=2,Y=21,width=8,height=5,parent=0)
Created a char(Char=B,X=4,Y=24,parent=2)
<div><div><div>A</div></div><div><div>B</div></div></div>

3
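Why the close votes? I want to put some ASCII art into a program that can then return HTML. - yazzapps.com
6
Sounds interesting. - 0x_Anakin
7
Yes, it sounds interesting, but if a newcomer asks for code or a library recommendation, the question gets closed immediately, right? - brasofilo
8
No parsing is needed: use flood fill to mark the div regions, extract the div coordinates, check the bounds to find each enclosing parent div, use that information to build a tree, then recurse over the tree to print the HTML tags. I'm home sick today =(, so I used the time off to write a program; see https://github.com/h3nr1x/asciidivs2html - higuaro
3
The "minimal understanding" close reason is probably because you have not shown any attempt at solving this. Are you asking someone to write the whole thing for you? - Martin Smith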
3 Answers

11

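Methodology

One way to implement this:

  • Create an in-memory 2D array (an array of arrays), similar to a chessboard.

Then I would write an algorithm that, when it detects a "-" character, calls a method to find the remaining corners (top right, bottom left, bottom right) by following the characters to where they end.

For example (quick pseudocode):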

while(selectedCell==I) selectedCell=selectedCell.goDown();
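The corner-following idea above could be sketched in JavaScript roughly as follows. This is only an illustration under stated assumptions, not the code from this answer: the helper names `toGrid`, `cellAt`, and `traceBox` are hypothetical, and the sketch assumes boxes are drawn with '-' for horizontal edges and 'I' for vertical edges, as in the question's example.

```javascript
// Hypothetical helpers for illustration; none of these names come from the answer.

// Build the "chessboard": one row per line, one cell per character.
function toGrid(text) {
  return text.split("\n").map(function (line) { return line.split(""); });
}

// Safe lookup that returns undefined outside the grid.
function cellAt(grid, row, col) {
  return grid[row] !== undefined ? grid[row][col] : undefined;
}

// Starting at a candidate top-left corner, walk right along '-' to get the
// width and down along 'I' to get the height, then check the other edges.
function traceBox(grid, row, col) {
  if (cellAt(grid, row, col) !== "-") return null;

  var width = 0;
  while (cellAt(grid, row, col + width) === "-") width++;

  var sides = 0;
  while (cellAt(grid, row + 1 + sides, col) === "I") sides++;
  if (sides === 0) return null; // a dash run with no left wall is not a box top

  var bottom = row + 1 + sides;
  for (var c = col; c < col + width; c++) {
    if (cellAt(grid, bottom, c) !== "-") return null; // bottom edge must match the top
  }
  for (var r = row + 1; r < bottom; r++) {
    if (cellAt(grid, r, col + width - 1) !== "I") return null; // right wall
  }
  // Height counts the top edge, the side rows, and the bottom edge.
  return { row: row, col: col, width: width, height: sides + 2 };
}
```

Calling `traceBox(grid, row, col)` at every '-' cell that is not already inside a traced edge would yield the list of boxes; how to skip already-visited cells is left open here.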

With this strategy you can map out your boxes and work out which boxes are contained in which.

All that remains is to print this information as HTML...
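As a rough sketch of these last two steps (deciding which box contains which, then printing HTML), the following assumes each detected box is a plain object of the form `{ row, col, width, height, label }`. That shape and the function names `contains`, `buildTree`, and `toHtml` are assumptions made for this example; labels are assumed to be plain text that needs no HTML escaping.

```javascript
// Assumed box shape: { row, col, width, height, label (optional) }.

// True when `inner` lies strictly inside `outer` (edges do not touch).
function contains(outer, inner) {
  return outer !== inner &&
    inner.row > outer.row && inner.col > outer.col &&
    inner.row + inner.height < outer.row + outer.height &&
    inner.col + inner.width < outer.col + outer.width;
}

// Attach every box to the innermost box that contains it.
function buildTree(boxes) {
  var nodes = boxes.map(function (b) { return { box: b, children: [] }; });
  var roots = [];
  nodes.forEach(function (node) {
    var parent = null;
    nodes.forEach(function (cand) {
      if (contains(cand.box, node.box) &&
          (parent === null || contains(parent.box, cand.box))) {
        parent = cand; // cand is nested deeper than the current parent
      }
    });
    (parent ? parent.children : roots).push(node);
  });
  return roots;
}

// Recursively print one <div> per box, label first, then nested boxes.
function toHtml(nodes) {
  return nodes.map(function (n) {
    return "<div>" + (n.box.label || "") + toHtml(n.children) + "</div>";
  }).join("");
}

// The three boxes from the question's example, written out by hand.
var exampleBoxes = [
  { row: 0, col: 0, width: 32, height: 10 },
  { row: 2, col: 4, width: 7, height: 5, label: "A" },
  { row: 2, col: 21, width: 7, height: 5, label: "B" }
];
console.log(toHtml(buildTree(exampleBoxes)));
// prints <div><div>A</div><div>B</div></div>
```

Choosing the innermost containing box as the parent keeps the tree correct for deeper nesting, at the cost of a quadratic scan that is acceptable for small drawings.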

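Quick and dirty implementation

Since I was in a good mood, I spent a little over an hour quickly putting together a toy implementation. The code below is not optimized with respect to iterating over the cells, and it would need refactoring to become a serious framework.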

Cell.java


package AsciiToDIVs;

public class Cell {
    public char Character;
    public CellGrid parentGrid;
    private int rowIndex;
    private int colIndex;

    public Cell(char Character, CellGrid parent, int rowIndex, int colIndex)
    {
        this.Character = Character;
        this.parentGrid = parent;
        this.rowIndex = rowIndex;
        this.colIndex = colIndex;
    }

    public int getRowIndex() {
        return rowIndex;
    }

    public int getColIndex() {
        return colIndex;
    }
}

CellGrid.java


package AsciiToDIVs;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Iterator;

public class CellGrid {

    private ArrayList<ArrayList<Cell>> CellGridData;

    public CellGrid(String asciiFile) throws IOException {
        readDataFile(asciiFile);
    }

    public ArrayList<FoundObject> findBoxes(FoundBoxObject parent)
    {

        int startRowIndex = 0, startColIndex = 0, 
                parentRowLimit = Integer.MAX_VALUE, 
                parentColLimit = Integer.MAX_VALUE,
                startingColIndex = 0;
        if(parent != null)
        {
            startRowIndex = parent.getRowIndex()+1;
            startColIndex = startingColIndex =  parent.getColIndex()+1;
            parentRowLimit = parent.getRowIndex() + parent.getHeight();
            parentColLimit = parent.getColIndex() + parent.getWidth();
        }

        ArrayList<FoundObject> results = new ArrayList<FoundObject>();

        Cell currentCell;

        if(startRowIndex>=CellGridData.size())
        return null;        

        for(; startRowIndex<CellGridData.size() && startRowIndex<parentRowLimit; startRowIndex++ )
        {
            startColIndex = startingColIndex;

            for(; startColIndex< CellGridData.get(startRowIndex).size() && startColIndex<parentColLimit; startColIndex++)
            {           
                FoundBoxObject withinBox = checkWithinFoundBoxObject(results, startRowIndex, startColIndex);

                if(withinBox !=null)
                startColIndex+=withinBox.getWidth();

                currentCell = getCell(startRowIndex, startColIndex);

                if(currentCell!=null)
                {
                    if(currentCell.Character == '-') // Found a TOP-CORNER
                    {
                        int boxHeight =  getConsecutiveIs(startRowIndex+1, startColIndex) + 1;
                        if(boxHeight>1)
                        {
                            int boxWidth = getConsecutiveDashes(startRowIndex, startColIndex);

                            FoundBoxObject box = new FoundBoxObject(startRowIndex, startColIndex, boxWidth, boxHeight, parent);
                            results.add(box);
                            findBoxes(box);

                            startColIndex+=boxWidth;                            
                        }                   
                    }

                    //This is a character
                    else if(currentCell.Character != '-' && currentCell.Character != 'I' && currentCell.Character != ' ' 
                            && currentCell.Character != '\n' && currentCell.Character != '\r' && currentCell.Character != '\t')
                    {
                        FoundCharObject Char = new FoundCharObject(startRowIndex, startColIndex, parent,  currentCell.Character);
                        results.add(Char);
                    }
                }
            }       
        }

        if(parent!=null)
        parent.containedObjects = results;

        return results;     
    }

    public static String printDIV(ArrayList<FoundObject> objects)
    {
        String result = "";
        Iterator<FoundObject> it = objects.iterator();
        FoundObject fo;

        while(it.hasNext())
        {
            result+="<div>";

            fo = it.next();

            if(fo instanceof FoundCharObject)
            {
                FoundCharObject fc = (FoundCharObject)fo;
                result+=fc.getChar();
            }

            if(fo instanceof FoundBoxObject)
            {
                FoundBoxObject fb = (FoundBoxObject)fo;
                result+=printDIV(fb.containedObjects);
            }

            result+="</div>";
        }

        return result;
    }

    private FoundBoxObject checkWithinFoundBoxObject(ArrayList<FoundObject> results, int rowIndex, int colIndex)
    {
        Iterator<FoundObject> it = results.iterator();
        FoundObject f;
        FoundBoxObject fbox = null;
        while(it.hasNext())
        {
            f = it.next();

            if(f instanceof FoundBoxObject)
            {
                fbox = (FoundBoxObject) f;

                if(rowIndex >= fbox.getRowIndex() && rowIndex <= fbox.getRowIndex() + fbox.getHeight())
                {
                    if(colIndex >= fbox.getColIndex() && colIndex <= fbox.getColIndex() + fbox.getWidth())
                    {
                        return fbox;
                    }
                }
            }
        }

        return null;
    }

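    // Counts the run of '-' cells starting at (startRowIndex, startColIndex).
    // Note: because of the post-increment below, the starting cell is read twice,
    // so the result is one more than the number of dashes (the sample run in the
    // question reports width=8 for the seven-dash inner boxes).
    private int getConsecutiveDashes(int startRowIndex, int startColIndex)
    {
        int counter = 0;
        Cell cell = getCell(startRowIndex, startColIndex);

        while( cell!=null && cell.Character =='-')
        {
            counter++;
            cell = getCell(startRowIndex, startColIndex++);
        }

        return counter;

    }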

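    // Counts the run of 'I' cells going down from (startRowIndex, startColIndex).
    // Note: the post-increment re-reads the starting cell, so the result is one
    // more than the number of 'I' cells. Together with the +1 that findBoxes adds,
    // the computed height covers the top and bottom edges as well.
    private int getConsecutiveIs(int startRowIndex, int startColIndex)
    {
        int counter = 0;
        Cell cell = getCell(startRowIndex, startColIndex);

        while( cell!=null && cell.Character =='I')
        {
            counter++;
            cell = getCell(startRowIndex++, startColIndex);
        }

        return counter;
    }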

    public Cell getCell(int rowIndex, int columnIndex)
    {
        ArrayList<Cell> row;


        if(rowIndex<CellGridData.size())
        row = CellGridData.get(rowIndex);
        else return null;

        Cell cell = null;

        if(row!=null){
            if(columnIndex<row.size())
            cell = row.get(columnIndex);
        }

        return cell;
    }


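    public Iterator<ArrayList<Cell>> getRowGridIterator(int StartRow) {
        Iterator<ArrayList<Cell>> itRow = CellGridData.iterator();

        int CurrentRow = 0;

        // Skip rows until the iterator is positioned at StartRow.
        // Both conditions are in the loop header so the loop stops once
        // StartRow is reached instead of spinning forever.
        while (itRow.hasNext() && CurrentRow++ < StartRow) {
            itRow.next();
        }
        return itRow;
    }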

    private void readDataFile(String asciiFile) throws IOException {
        CellGridData = new ArrayList<ArrayList<Cell>>();
        ArrayList<Cell> row;

        FileInputStream fstream = new FileInputStream(asciiFile);
        BufferedReader br = new BufferedReader(new InputStreamReader(fstream));

        String strLine;

        // Read File Line By Line
        int rowIndex = 0;
        while ((strLine = br.readLine()) != null) {
            CellGridData.add(row = new ArrayList<Cell>());
            // System.out.println (strLine);
            for (int colIndex = 0; colIndex < strLine.length(); colIndex++) {
                row.add(new Cell(strLine.charAt(colIndex), this, rowIndex,colIndex));
                // System.out.print(strLine.charAt(i));
            }
            rowIndex++;
            // System.out.println();
        }

        // Close the input stream
        br.close();
    }

    public String printGrid() {
        String result = "";

        Iterator<ArrayList<Cell>> itRow = CellGridData.iterator();
        Iterator<Cell> itCol;
        Cell cell;

        while (itRow.hasNext()) {
            itCol = itRow.next().iterator();

            while (itCol.hasNext()) {
                cell = itCol.next();
                result += cell.Character;
            }
            result += "\n";
        }

        return result;
    }

}

FoundBoxObject.java


package AsciiToDIVs;

import java.util.ArrayList;

public class FoundBoxObject extends FoundObject {
    public ArrayList<FoundObject> containedObjects = new ArrayList<FoundObject>();
    public static int boxCounter = 0;

    public final int ID = boxCounter++;

    public FoundBoxObject(int rowIndex, int colIndex, int width, int height, FoundBoxObject parent) {
        super(rowIndex, colIndex, width, height);

        if(parent!=null)
        System.out.println("Created a box(" +
                "ID="+ID+
                ",X="+rowIndex+
                ",Y="+colIndex+
                ",width="+width+
                ",height="+height+
                ",parent="+parent.ID+")");
        else
            System.out.println("Created a box(" +
                    "ID="+ID+
                    ",X="+rowIndex+
                    ",Y="+colIndex+
                    ",width="+width+
                    ",height="+height+
                    ")");   
    }

}

FoundCharObject.java


package AsciiToDIVs;

public class FoundCharObject extends FoundObject {
private Character Char;

public FoundCharObject(int rowIndex, int colIndex,FoundBoxObject parent, char Char) {
    super(rowIndex, colIndex, 1, 1);

    if(parent!=null)
    System.out.println("Created a char(" +
            "Char="+Char+
            ",X="+rowIndex+
            ",Y="+colIndex+
            ",parent="+parent.ID+")");
    else
        System.out.println("Created a char(" +
                "Char="+Char+
                ",X="+rowIndex+
                ",Y="+colIndex+")");

    this.Char = Char;
}

public Character getChar() {
    return Char;
}
}

FoundObject.java


package AsciiToDIVs;

public class FoundObject {

    private int rowIndex;
    private int colIndex;
    private int width = 0;
    private int height = 0;

    public FoundObject(int rowIndex, int colIndex, int width, int height )
    {
        this.rowIndex = rowIndex;
        this.colIndex = colIndex;
        this.width = width;
        this.height = height;
    }

    public int getRowIndex() {
        return rowIndex;
    }

    public int getColIndex() {
        return colIndex;
    }

    public int getWidth() {
        return width;
    }

    public int getHeight() {
        return height;
    }
}

Main method


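package AsciiToDIVs;

import java.io.IOException;

// The class name RunConverter is taken from the run command shown in the question.
public class RunConverter {
    public static void main(String args[])
    {
        try {
            CellGrid grid = new CellGrid("ascii.txt");
            System.out.println(CellGrid.printDIV(grid.findBoxes(null)));
            //System.out.println(grid.printGrid());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}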

Update

'printDIV' should look like this (it was printing more '<div>' tags than needed).

public static String printDIV(ArrayList<FoundObject> objects)
    {
        String result = "";
        Iterator<FoundObject> it = objects.iterator();
        FoundObject fo;

        while(it.hasNext())
        {
            fo = it.next();

            if(fo instanceof FoundCharObject)
            {
                FoundCharObject fc = (FoundCharObject)fo;
                result+=fc.getChar();
            }

            if(fo instanceof FoundBoxObject)
            {
                result+="<div>";
                FoundBoxObject fb = (FoundBoxObject)fo;
                result+=printDIV(fb.containedObjects);
                result+="</div>";
            }           
        }

        return result;
    }

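This seems to be heading in the right direction. - yazzapps.com
This seems to work for the example case I provided... please see the update I posted on the question :) - yazzapps.com
1
I suggest switching from .get to iterators for traversing the ArrayList elements. Also test different cases... - Menelaos
This approach recognizes boxes in some cases, but only some of the time. What you really need is to recognize a set of entities that are related to each other in valid ways. That is generally called "parsing" (and string parsing has an extensive literature behind it). This approach cannot collect entities and verify the relationships between them, so I doubt it will succeed in complex cases. - Ira Baxter
1
This approach is a toy example that recognizes all boxes conforming to a specific format (following specific requirements). Obviously, special cases (e.g. connected boxes) would require modifications (but so would the rules in any other system). It can be extended to handle more complex shapes for simple requirements. More complex requirements would need more code or a framework based on more sophisticated algorithms. Nevertheless, it is sufficient for the OP's requirements and can be modified to recognize other shapes (e.g. triangles, connections, etc.). - Menelaos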

5

Here is a fairly simple solution in JavaScript, tested with Node. Of course, you will need to adjust the input and output methods.

var s = "\n\
--------------------------------\n\
I                              I\n\
I   -------          -------   I\n\
I   I     I          I     I   I\n\
I   I  A  I          I  B  I   I\n\
I   I     I          I     I   I\n\
I   -------          -------   I\n\
I                              I\n\
I                              I\n\
--------------------------------\n\
";

var lines = s.split('\n');

var outer_box_top_re = /--+/g;

// Declared explicitly so the loop does not rely on implicit globals.
var i, res, L, R;
for (i=0; i<lines.length; i++) {
    while ((res = outer_box_top_re.exec(lines[i])) != null) {
        L = res.index;
        R = outer_box_top_re.lastIndex;
        process_box(i, L, R);
    }
}

function process_box(T, L, R) {
    console.log('<div top="' + T + '" left="' + L + '" right="' + R + '">')
    blank_out(T, L, R)

    var i = T;
    while (1) {
        i += 1;
        if (i >= lines.length) {
            console.log('Fell off bottom of ascii-art without finding bottom of box');
            process.exit(1);
        }

        var line = lines[i];

        if (line[L] == 'I' && line[R-1] == 'I') {
            // interior

            // Look for (the tops of) sub-boxes.
            // (between L+1 and R-2)
            var inner_box_top_re = /--+/g;
            // Inner and outer need to be separate so that
            // inner doesn't stomp on outer's lastIndex.
            inner_box_top_re.lastIndex = L+1;
            // Declared locally so recursive calls do not share state through implicit globals.
            var res, sub_L, sub_R;
            while ((res = inner_box_top_re.exec(lines[i])) != null) {
                sub_L = res.index;
                sub_R = inner_box_top_re.lastIndex;
                if (sub_L > R-1) { break; }
                process_box(i, sub_L, sub_R);
            }

            // Look for any other content (i.e., a box label)
            var content = lines[i].substring(L+1, R-1);
            if (content.search(/[^ ]/) != -1) {
                console.log(content);
            }

            blank_out(i, L, R);
        }
        else if (line.substring(L,R).match(/^-+$/)) {
            // bottom
            blank_out(i, L, R);
            break;
        }
        else {
            console.log("line " + i + " doesn't contain a valid continuation of the box");
            process.exit(1)
        }
    }

    console.log('</div>')
}

function blank_out(i, L, R) {
    lines[i] = (
          lines[i].substring(0,L)
        + lines[i].substring(L,R).replace(/./g, ' ')
        + lines[i].substring(R)
    );
}

1
Very nice, especially considering this is your first appearance on Stack Overflow!! :) - yazzapps.com

1
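What you want is the notion of two-dimensional parsing, which detects 2D entities and verifies that they stand in legal relationships to one another.
See http://mmi.tudelft.nl/pub/siska/TSD%202DVisLangGrammar.pdf
The difficulty is defining the set of possible "ASCII art" constraints. Do you only want to recognize letters? Letters made up of the same character? Cursive letters? Boxes? (Your example has boxes whose sides are not made of the same ASCII character.) Boxes with arbitrarily thick walls? Nested boxes? Diagrams with (thin/thick) arrows? Kilroy-was-here-nose-over-the-wall? Pictures of the Mona Lisa in which the character pixels provide density relations? What exactly do you mean by "ASCII art"?
The real problem is defining the scope of what you intend to recognize. If you limit that scope, your odds of success go way up (see the referenced paper).
The problem here has little to do with Java or JavaScript and much more to do with algorithms. Pick a restricted class of art, pick the right algorithm, and what you are left with is a relatively easy coding problem. With no restriction and no algorithm, no amount of JavaScript can save you.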
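To make "pick a restricted class of art" concrete, one possible first step is to reject any input that falls outside an agreed alphabet before trying to recognize anything. The sketch below assumes, purely for illustration, a class limited to '-' and 'I' borders, spaces, and alphanumeric labels; that class definition and the name `findScopeViolations` are assumptions of this example, not something specified by the question.

```javascript
// Assumed restricted class (an illustration, not a specification from the question):
// '-' for horizontal edges, 'I' for vertical edges, spaces, and alphanumeric labels.
var ALLOWED_LINE = /^[-I A-Za-z0-9]*$/;

// Returns a list of human-readable problems; an empty list means the input
// stays inside the restricted alphabet (it may still contain malformed boxes).
function findScopeViolations(text) {
  var problems = [];
  text.split("\n").forEach(function (line, i) {
    if (!ALLOWED_LINE.test(line)) {
      problems.push("line " + i + ": character outside the restricted alphabet");
    }
  });
  return problems;
}
```

A check like this only enforces the alphabet. Verifying that the boxes are closed and properly nested is the job of whichever recognition algorithm is chosen for the class.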

1
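The paper you cite does not explain in detail how such a parser would work. Other literature might be more relevant. The paper deals with an emerging language based on the relationships/arrangements of icons. Also, the OP's requirements never mention pictures, so "Pictures of the Mona Lisa in which the character pixels provide density relations?" is a bit of an exaggeration. The OP did provide an example of what he is looking for, but what is missing is a specification (design document) that determines which specific features (recognized entities/relationships) should be supported. - Menelaos
2
The OP said "ASCII art". That is not a well-defined term. Showing a single example does not define the category. You cannot define the problem of parsing C++ by saying "I want to parse strings that look like program code, e.g. foo.bar(x,y)". Yes, we agree that he has not defined a specification. That is what I was saying: until that happens, he cannot make any serious progress. - Ira Baxter
1
Yes, I think I need to specify the problem better, I agree. - yazzapps.com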
