Please note that this answer, "Ruby Regex Error: incompatible encoding regexp match (ASCII-8BIT regexp with UTF-8 string)", does not apply to me, because I am already using Ruby > 1.9.
I'm using Rails 4.2.7 with Ruby 2.3. I have this expression:
phrase = phrase.gsub(/\A\p{Space}+|\p{Space}+\z/, '')
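Unfortunately, if the variable "phrase" has the encoding "ASCII-8BIT", I get the error below. Is there a way to write the above so that its encoding matches the encoding of the variable phrase? As far as I can tell, the regular expression is automatically compiled as UTF-8, even though my variable may not be UTF-8.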
Encoding::CompatibilityError: incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)
from /Users/davea/Documents/workspace/demoapp/app/services/text_table_to_my_object_time_converter_service.rb:449:in `gsub'
from /Users/davea/Documents/workspace/demoapp/app/services/text_table_to_my_object_time_converter_service.rb:449:in `find_header'
from /Users/davea/Documents/workspace/demoapp/app/services/text_table_to_my_object_time_converter_service.rb:157:in `block in get_headers_by_line'
from /Users/davea/.rvm/gems/ruby-2.3.0/gems/activesupport-4.2.7.1/lib/active_support/core_ext/range/each.rb:7:in `each'
from /Users/davea/.rvm/gems/ruby-2.3.0/gems/activesupport-4.2.7.1/lib/active_support/core_ext/range/each.rb:7:in `each_with_time_with_zone'
from /Users/davea/Documents/workspace/demoapp/app/services/text_table_to_my_object_time_converter_service.rb:156:in `get_headers_by_line'
from /Users/davea/Documents/workspace/demoapp/app/services/text_table_to_my_object_time_converter_service.rb:99:in `get_headers'
from /Users/davea/Documents/workspace/demoapp/app/services/text_table_to_my_object_time_converter_service.rb:243:in `block in get_data_hash'
from /Users/davea/Documents/workspace/demoapp/app/services/text_table_to_my_object_time_converter_service.rb:242:in `each_line'
from /Users/davea/Documents/workspace/demoapp/app/services/text_table_to_my_object_time_converter_service.rb:242:in `get_data_hash'
from /Users/davea/Documents/workspace/demoapp/app/services/text_table_to_my_object_time_converter_service.rb:21:in `get_my_object_times'
from /Users/davea/Documents/workspace/demoapp/app/services/text_processor_service.rb:33:in `process_page_data'
from /Users/davea/Documents/workspace/demoapp/app/services/abstract_import_service.rb:82:in `process_my_object_data'
from (irb):8
from /Users/davea/.rvm/gems/ruby-2.3.0/gems/railties-4.2.7.1/lib/rails/commands/console.rb:110:in `start'
from /Users/davea/.rvm/gems/ruby-2.3.0/gems/railties-4.2.7.1/lib/rails/commands/console.rb:9:in `start'
from /Users/davea/.rvm/gems/ruby-2.3.0/gems/railties-4.2.7.1/lib/rails/commands/commands_tasks.rb:68:in `console'
from /Users/davea/.rvm/gems/ruby-2.3.0/gems/railties-4.2.7.1/lib/rails/commands/commands_tasks.rb:39:in `run_command!'
from /Users/davea/.rvm/gems/ruby-2.3.0/gems/railties-4.2.7.1/lib/rails/commands.rb:17:in `<top (required)>'
from bin/rails:4:in `require'
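
For what it's worth, the error above can be reproduced outside Rails in a few lines. This is only a minimal sketch: the sample strings and variable names are made up for illustration and are not the real value of phrase, which is not shown here.

```ruby
# encoding: utf-8
# (UTF-8 is already the default source encoding on Ruby 2.x; it is spelled out
# here because the regexp's encoding appears to follow the source encoding
# when the literal contains \p{...}.)

pattern = /\A\p{Space}+|\p{Space}+\z/

# Hypothetical sample data, not the real contents of `phrase`.
ascii_only = "  hello  ".b          # ASCII-8BIT, but every byte is below 0x80
high_bytes = "  caf\xC3\xA9  ".b    # ASCII-8BIT with bytes above 0x7F

# A binary string holding only 7-bit bytes is accepted by the regexp.
trimmed = ascii_only.gsub(pattern, '')

# Once a byte has the 8th bit set, the match is rejected.
error = nil
begin
  high_bytes.gsub(pattern, '')
rescue Encoding::CompatibilityError => e
  error = e
end

puts trimmed.inspect
puts error.class
```

So the failure seems to depend on the bytes in the string as well as on its encoding label, which would explain why the same line of code works for some inputs and not for others.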
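What is the specific value of phrase? Also, ASCII-8BIT is not meant for holding text; it is for raw byte-level data. What encoding do the contents of phrase actually use? The solution should be to force_encoding phrase to its actual encoding, then encode it to UTF-8, and then apply the regexp. (I believe a regexp containing \p{Space} will be UTF-8. /foo/.encoding is not UTF-8 on my machine.) - Amadan

"\xc3\xa4" is "ä" when forced to UTF-8, but it is "Ã¤" in ISO-8859-1 and "ﾃ､" in SJIS... As long as you stick to the lower half of ASCII you probably won't hit any errors, because most encodings agree there, but as soon as the 8th bit is involved, the regexp needs to know the exact encoding in order to know what counts as a "space". - Amadan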
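
The approach suggested in the comments (relabel the bytes with their real encoding, convert to UTF-8, then apply the regexp) could be sketched roughly as follows. The helper name trim_unicode_space, the constant TRIM_PATTERN and the source_encoding parameter are hypothetical names introduced for illustration, and the sketch assumes the real encoding of the bytes is known.

```ruby
# encoding: utf-8

TRIM_PATTERN = /\A\p{Space}+|\p{Space}+\z/

# Hypothetical helper, not from the original code. `source_encoding` is an
# assumption: it has to be the encoding the bytes were really written in.
def trim_unicode_space(phrase, source_encoding = Encoding::UTF_8)
  text = phrase.dup.force_encoding(source_encoding) # relabel the bytes, no conversion
  text = text.encode(Encoding::UTF_8)               # transcode to UTF-8 if needed
  text.gsub(TRIM_PATTERN, '')
end

# Made-up sample inputs, both tagged ASCII-8BIT as in the question.
utf8_bytes   = "\u00A0 caf\u00E9 \n".b  # UTF-8 bytes, with a leading no-break space
latin1_bytes = " caf\xE9 ".b            # the same word as ISO-8859-1 bytes

from_utf8   = trim_unicode_space(utf8_bytes)
from_latin1 = trim_unicode_space(latin1_bytes, Encoding::ISO_8859_1)

puts from_utf8 == from_latin1
puts from_utf8.encoding
```

If the guess for source_encoding is wrong, the result may be garbled or the call may raise, so checking valid_encoding? on the relabelled string before trimming is probably worthwhile.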