Replacing commas inside quotes in a CSV file with a regex.

4
Take, for example, the following string:
"COURSE",247,"28/4/2016 12:53 Europe/Brussels",1,"Verschil tussen merk, product en leveranciersverantwoordelijke NL","Active Enro"

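The goal is to replace the comma inside "merk, product" while keeping separator commas such as "," and ", & ," so that the file can be split correctly.

Any suggestions?

Kind regards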


The regex should only apply when the comma is surrounded by A-Z characters. - FastSolutions
1
My suggestion is to pick a proper CSV parser for the programming language you are using. - Sebastian Proske
1
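Use ,(?!(?:[^"]*"[^"]*")*[^"]*$) and replace with an empty string. - Wiktor Stribiżew
@WiktorStribiżew Thanks, please post it as an answer, it works perfectly. - FastSolutions
This is not a code-writing service. What have you tried so far? - vwegert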
3 Answers

11

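First, you should take a look at the article "Understanding CSV files and their handling in ABAP".

For a one-off job, you can use this regex (but note that it may not work well on longer strings, so use it only as a last resort):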

,(?!(?:[^"]*"[^"]*")*[^"]*$)

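See the regex demo.

Pattern details

  • , - a literal comma
  • (?! - start of a negative lookahead: the comma must not be followed by...
    • (?: - start of a non-capturing group
      • [^"]* - zero or more characters other than "
      • " - a double quote
      • [^"]*" - same as above: zero or more non-" characters, then a double quote
    • )* - zero or more repetitions of the group above
    • [^"]* - zero or more characters other than "
    • $ - end of string
  • ) - end of the negative lookahead

In other words, the comma matches only when an odd number of double quotes remains before the end of the string, which means the comma sits inside a quoted field.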
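
To see the effect of the pattern on the sample line from the question, here is a minimal sketch. It is written in Python purely for illustration (the thread itself is about ABAP), and the names `COMMA_IN_QUOTES` and `strip_quoted_commas` are made up for this example. It assumes the quotes in each line are balanced.

```python
import re

# Pattern from the answer above: a comma that is NOT followed by an even
# number of double quotes up to the end of the string. Such a comma has an
# odd number of quotes after it, so it lies inside a quoted field.
COMMA_IN_QUOTES = re.compile(r',(?!(?:[^"]*"[^"]*")*[^"]*$)')


def strip_quoted_commas(line: str) -> str:
    """Remove commas that sit inside a double-quoted field (hypothetical helper)."""
    return COMMA_IN_QUOTES.sub("", line)


# Sample line taken from the question.
line = (
    '"COURSE",247,"28/4/2016 12:53 Europe/Brussels",1,'
    '"Verschil tussen merk, product en leveranciersverantwoordelijke NL",'
    '"Active Enro"'
)

cleaned = strip_quoted_commas(line)
fields = cleaned.split(",")  # only the separator commas are left

print(cleaned)
print(len(fields))
```

After the replacement, a plain split on "," yields the six fields of the record, because the only commas left are the separators.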

See the code below, but I have accepted your answer because it led me to the final solution. - FastSolutions

3
I found a better solution than a regex: just use the class CL_RSDA_CSV_CONVERTER, so there is no need to reinvent the wheel.
See the code below:
TYPES: BEGIN OF ttab,
         rec(1000) TYPE c,
       END OF ttab.
TYPES: BEGIN OF tdat,
         userid(100)                TYPE c,
         activeuser(100)            TYPE c,
         firstname(100)             TYPE c,
         lastname(100)              TYPE c,
         middlename(100)            TYPE c,
         supervisor(100)            TYPE c,
         supervisor_firstname(100)  TYPE c,
         supervisor_lastname(100)   TYPE c,
         supervisor_middle(100)     TYPE c,
         scheduled_offering_id(100) TYPE c,
         description(100)           TYPE c,
         domain(100)                TYPE c,
         registration(100)          TYPE c,
         current_registration(100)  TYPE c,
         max_registration(100)      TYPE c,
         item_type(100)             TYPE c,
         item_id(100)               TYPE c,
         item_revision_date(100)    TYPE c,
         revision_number(100)       TYPE c,
         title(100)                 TYPE c,
         status(100)                TYPE c,
         start_date(100)            TYPE c,
         end_date(100)              TYPE c,
         location(100)              TYPE c,
         instructor_fistname(100)   TYPE c,
         instructor_lastname(100)   TYPE c,
         instructor_middlename(100) TYPE c,
         column_number(100)         TYPE c,
         label(100)                 TYPE c,
         value(100)                 TYPE c,
         description2(100)          TYPE c,
         start_date_short(100)      TYPE c,
         begda                      TYPE begda,
         start_time(100)            TYPE c,
         start_time_24_hour(100)    TYPE c,
         start_12_hour_type(100)    TYPE c,
         start_timezone(100)        TYPE c,
         end_date_short(100)        TYPE c,
         endda                      TYPE endda,
         end_time(100)              TYPE c,
         end_time_24_hour(100)      TYPE c,
         end_12_hour_type(100)      TYPE c,
         end_timezone(100)          TYPE c,
         pernr                      TYPE pernr_d,
       END OF tdat.

CONSTANTS: co_delete           TYPE pspar-actio VALUE 'DEL',
           co_attendance       TYPE string VALUE '2002',
           co_att_prelp        TYPE prelp-infty VALUE '2002',
           co_att_subty        TYPE string VALUE '3000'.

DATA:
  itab                 TYPE TABLE OF ttab WITH HEADER LINE,
  idat                 TYPE TABLE OF tdat WITH HEADER LINE,
  lw_idat              LIKE LINE OF idat,
  lw_found_training    LIKE LINE OF idat,
  file_str             TYPE string,
  lv_uname             TYPE syuname,
  lo_person            TYPE REF TO zhr_cl_pa_person,
  lv_input_time        TYPE tims,
  lv_output_time       TYPE tims,
  lv_day(2)            TYPE c,
  lv_month(2)          TYPE c,
  lv_year(4)           TYPE c,
  lv_time(6)           TYPE c,
  lv_abap_date         TYPE string,
  lv_lock_return       LIKE bapireturn1,
  ls_attendance        LIKE bapihrabsatt_in,
  lt_attendance_output TYPE TABLE OF bapiret2,
  ls_return            LIKE bapireturn,
  ls_return1           LIKE bapireturn1,
  lt_absatt_data       TYPE TABLE OF pprop,
  lw_absatt_data       LIKE LINE OF lt_absatt_data,
  lt_pa2002            TYPE TABLE OF pa2002,
  lw_pa2002            LIKE LINE OF lt_pa2002,
  lw_msg               TYPE bapireturn1,
  lt_p2002             TYPE TABLE OF p2002,
  lw_p2002             LIKE LINE OF lt_p2002,
  lc_pgmid             TYPE old_prog VALUE 'ZKA_TEXT_UPDATE',
  lr_upd_cluster       TYPE REF TO cl_hrpa_text_cluster,
  ls_text              TYPE hrpad_text,
  ls_pskey             TYPE pskey,
  lt_text_194          TYPE hrpad_text_tab,
  lv_text              TYPE string,
  lo_ref               TYPE REF TO cx_hrpa_invalid_parameter,
  lw_struct            TYPE tdat,
  lo_csv               TYPE REF TO cl_rsda_csv_converter.

CALL METHOD cl_rsda_csv_converter=>create
  RECEIVING
    r_r_conv = lo_csv.
CREATE OBJECT lr_upd_cluster.

*--------------------------------------------------*
* selection screen design
*-------------------------------------------------*

SELECTION-SCREEN BEGIN OF BLOCK selection1 WITH FRAME.
PARAMETERS: p_file TYPE localfile.
SELECTION-SCREEN SKIP.
SELECTION-SCREEN BEGIN OF LINE.
SELECTION-SCREEN COMMENT 4(51) text-002.
PARAMETERS p_futatt AS CHECKBOX DEFAULT 'X'.
SELECTION-SCREEN END OF LINE.
SELECTION-SCREEN BEGIN OF LINE.
SELECTION-SCREEN COMMENT 4(51) text-001.
PARAMETERS p_active AS CHECKBOX DEFAULT 'X'.
SELECTION-SCREEN END OF LINE.
SELECTION-SCREEN END OF BLOCK selection1.

*--------------------------------------------------*
* at selection screen for field
*-------------------------------------------------*
AT SELECTION-SCREEN ON VALUE-REQUEST FOR p_file.
  CALL FUNCTION 'KD_GET_FILENAME_ON_F4'
    EXPORTING
      static    = 'X'
    CHANGING
      file_name = p_file.

*--------------------------------------------------*
* start of selection
*-------------------------------------------------*
START-OF-SELECTION.
  file_str = p_file.
  CALL FUNCTION 'GUI_UPLOAD'
    EXPORTING
      filename                = file_str
    TABLES
      data_tab                = itab
    EXCEPTIONS
      file_open_error         = 1
      file_read_error         = 2
      no_batch                = 3
      gui_refuse_filetransfer = 4
      invalid_type            = 5
      no_authority            = 6
      unknown_error           = 7
      bad_data_format         = 8
      header_not_allowed      = 9
      separator_not_allowed   = 10
      header_too_long         = 11
      unknown_dp_error        = 12
      access_denied           = 13
      dp_out_of_memory        = 14
      disk_full               = 15
      dp_timeout              = 16
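      OTHERS                  = 17.
  IF sy-subrc <> 0.
    " Abort: itab cannot be trusted if the upload failed.
    MESSAGE 'File upload failed' TYPE 'E'.
  ENDIF.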

*--------------------------------------------------------------------*
* Delete file headers
*--------------------------------------------------------------------*
  DELETE itab INDEX 1.

*--------------------------------------------------*
* process and display output
*-------------------------------------------------*

  LOOP AT itab.
    CLEAR idat.

    CALL METHOD lo_csv->csv_to_structure
      EXPORTING
        i_data   = itab-rec
      IMPORTING
        e_s_data = lw_struct.

    MOVE-CORRESPONDING lw_struct TO idat.

    APPEND idat.
  ENDLOOP.

Good that you found a solution. - Wiktor Stribiżew

2
  1. Read the file with a CSV reader.
  2. Replace the commas in each field.
  3. Write the file with a CSV writer.

You do not need a regex for this task.
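
The three steps above can be sketched as follows. This is only an illustration in Python using its standard `csv` module, not ABAP; the function name `remove_commas_in_fields` is hypothetical, and the sketch assumes a comma-separated input with double-quoted fields, as in the question.

```python
import csv
import io


def remove_commas_in_fields(csv_text: str) -> str:
    """Parse CSV text, drop commas inside every field, and serialize it again."""
    reader = csv.reader(io.StringIO(csv_text))          # step 1: CSV reader
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")       # step 3: CSV writer
    for row in reader:
        # step 2: the parser has already separated the fields, so any comma
        # still present in a field was inside quotes in the input.
        writer.writerow([field.replace(",", "") for field in row])
    return out.getvalue()


# Sample line taken from the question.
sample = (
    '"COURSE",247,"28/4/2016 12:53 Europe/Brussels",1,'
    '"Verschil tussen merk, product en leveranciersverantwoordelijke NL",'
    '"Active Enro"\n'
)

result = remove_commas_in_fields(sample)
print(result)
```

Because the parser decides where fields begin and end, this approach does not depend on counting quotes by hand. The writer only adds quotes where a field needs them, so the output no longer quotes fields that became comma-free.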

