Parsing MIME encoded-words in Objective-C

3
Does OS X or iOS provide an API for parsing MIME encoded-words? These lovely strings:
=?iso-8859-1?Q?=A1Hola,_se=F1or!?=

Alternatively, is there a known open-source library that does this?


I'd suggest LibEtPan, but it seems like overkill just for MIME parsing. - CodaFi
2 Answers

2
I started from @NickolayO.'s answer, added base-64 support using QSStrings, and shortened the code with componentsSeparatedByString: and stringByReplacingOccurrencesOfString:withString:.
I have put the code on GitHub. Here is a handy snippet:
@implementation NSString (MimeEncodedWord)

- (BOOL) isMimeEncodedWord
{
    return [self hasPrefix:@"=?"]  && [self hasSuffix:@"?="];
}

+ (NSString*) stringWithMimeEncodedWord:(NSString*)word
{ // Example: =?iso-8859-1?Q?=A1Hola,_se=F1or!?=
    NSArray *components = [word componentsSeparatedByString:@"?"];
    if (components.count < 5) return nil;

    NSString *charset = [components objectAtIndex:1];
    CFStringEncoding cfEncoding = CFStringConvertIANACharSetNameToEncoding((CFStringRef)charset);
    if (cfEncoding == kCFStringEncodingInvalidId) return nil; // unknown or invalid charset name
    NSStringEncoding encoding = CFStringConvertEncodingToNSStringEncoding(cfEncoding);

    NSString *encodingType = [components objectAtIndex:2];
    NSString *encodedText = [components objectAtIndex:3];
    if ([encodingType isEqualToString:@"Q"])
    { // quoted-printable
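        // Escape literal percent signs first; otherwise the percent-unescaping below fails and returns nil.
        encodedText = [encodedText stringByReplacingOccurrencesOfString:@"%" withString:@"%25"];
        encodedText = [encodedText stringByReplacingOccurrencesOfString:@"_" withString:@" "];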
        encodedText = [encodedText stringByReplacingOccurrencesOfString:@"=" withString:@"%"];
        NSString *decoded = [encodedText stringByReplacingPercentEscapesUsingEncoding:encoding];
        return decoded;
    } else if ([encodingType isEqualToString:@"B"])
    { // base64
        NSData *data = [QSStrings decodeBase64WithString:encodedText];
        NSString *decoded = [[NSString alloc] initWithData:data encoding:encoding];
        return decoded;
    } else {
        NSLog(@"%@ is not a valid encoding (must be Q or B)", encodingType);
        return nil;
    }    
}

@end

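You're missing some input validation. Still, this looks shorter and nicer than my C-style version :> - Nickolay Olshevsky
Thanks! Added validation as a separate method. :) - hpique
Just noticed that the code above returns nil if the word contains a percent sign. I tested with emails sent by Twitter, so I assume the fault is not on the server side. I fixed it by adding the following line after the "// quoted-printable" comment: "encodedText = [encodedText stringByReplacingOccurrencesOfString:@"%" withString:[@"%" stringByAddingPercentEscapesUsingEncoding:encoding]];" - Toby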
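
The percent-sign problem in the comment above comes from routing the text through percent-unescaping. A minimal sketch of an alternative, assuming you decode the "Q" payload straight to raw bytes: it is written in plain C so it can be dropped into an Objective-C file, and the names hex_value and q_decode are hypothetical, not part of either answer. It also accepts lowercase hex digits, which is a leniency choice rather than something the answers do.

```c
#include <stddef.h>

/* Returns the value of a hex digit (either case), or -1 if c is not one. */
int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/*
 * Decodes the "Q" payload of an encoded word (the text between "?Q?" and "?=")
 * into raw bytes. `out` must have room for at least `len` bytes, because the
 * decoded form is never longer than the input.
 * Returns the number of bytes written, or -1 on a malformed "=XX" escape.
 */
long q_decode(const char *in, size_t len, unsigned char *out)
{
    size_t i = 0, j = 0;
    while (i < len) {
        char c = in[i];
        if (c == '_') {
            out[j++] = ' ';                 /* underscore stands for a space */
            i += 1;
        } else if (c == '=') {
            if (len - i < 3) return -1;     /* "=" must be followed by two hex digits */
            int hi = hex_value(in[i + 1]);
            int lo = hex_value(in[i + 2]);
            if (hi < 0 || lo < 0) return -1;
            out[j++] = (unsigned char)((hi << 4) | lo);
            i += 3;
        } else {
            out[j++] = (unsigned char)c;    /* everything else, '%' included, is literal */
            i += 1;
        }
    }
    return (long)j;
}
```

The resulting bytes could then be handed to -[NSString initWithBytes:length:encoding:] with the encoding looked up from the charset name, so no percent-escaping step is involved.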

1

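Here is the code for the quoted-printable variant (a MIME encoded-word can be either quoted-printable or base-64 encoded). For base64 you should do much the same, replacing the quoted-printable decoding with base64 decoding. This code probably needs some testing. Also, it only supports the encodings that NSString supports.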

- (NSString*) decodeMimeEncodedWord:(NSString*)word
{
    if (![word hasPrefix:@"=?"] || ![word hasSuffix:@"?="])
        return nil;

    NSUInteger i = 2;
    while ((i < word.length) && ([word characterAtIndex:i] != (unichar)'?'))
        i++;

    // Written as an addition so the unsigned length cannot underflow on very short input.
    if (i + 4 >= word.length)
        return nil;

    NSString *encodingName = [word substringWithRange:NSMakeRange(2, i - 2)];
    CFStringEncoding cfEncoding = CFStringConvertIANACharSetNameToEncoding((CFStringRef)encodingName);
    if (cfEncoding == kCFStringEncodingInvalidId)
        return nil; // unknown or invalid charset name
    NSStringEncoding encoding = CFStringConvertEncodingToNSStringEncoding(cfEncoding);

    NSString *encodedString;

    if ([[word substringWithRange:NSMakeRange(i + 1, 2)] isEqualToString:@"Q?"])
    {
        // quoted-printable
        encodedString = [word substringWithRange:NSMakeRange(i + 3, word.length - i - 5)];
        NSMutableData *binaryString = [[[NSMutableData alloc] initWithLength:encodedString.length] autorelease];
        unsigned char *binaryBytes = (unsigned char*)[binaryString mutableBytes];
        int j = 0;
        unichar h; // keep the full UTF-16 unit so non-ASCII characters are not truncated into valid hex digits

        for (i = 0; i < encodedString.length; i++)
        {
            unichar ch = [encodedString characterAtIndex:i];
            if (ch == (unichar)'_')
                binaryBytes[j++] = ' ';
            else if (ch == (unichar)'=')
            {
                if (i + 2 >= encodedString.length) // addition avoids unsigned underflow when length < 2
                    return nil;

                unsigned char val = 0;

                // high-order hex char
                h = [encodedString characterAtIndex:++i];
                if ((h >= '0') && (h <= '9'))
                    val += ((int)(h - '0')) << 4;
                else if ((h >= 'A') && (h <= 'F'))
                    val += ((int)(h + 10 - 'A')) << 4;
                else
                    return nil;
                // low-order hex char
                h = [encodedString characterAtIndex:++i];
                if ((h >= '0') && (h <= '9'))
                    val += (int)(h - '0');
                else if ((h >= 'A') && (h <= 'F'))
                    val += (int)(h + 10 - 'A');
                else
                    return nil;

                binaryBytes[j++] = val;             
            }
            else if (ch < 256)
                binaryBytes[j++] = ch;
            else
                return nil;
        }

        // Trim the buffer to the bytes actually written. No NUL terminator is needed
        // because the string is built from the data's explicit length.
        [binaryString setLength:j];

        NSString *result = [[[NSString alloc] initWithData:binaryString encoding:encoding] autorelease];
        // result is nil if the bytes are not valid in the given encoding

        return result;
    }
    else if ([[word substringWithRange:NSMakeRange(i + 1, 2)] isEqualToString:@"B?"])
    {
        // base64-encoded       
        return nil;
    }
    else
        return nil;
}
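
The base64 branch above is left as a stub that returns nil. As a rough sketch of what could fill it, assuming padded input in the standard base64 alphabet, here is a small decoder in plain C (so it compiles inside an Objective-C file). The names b64_value and b64_decode are hypothetical and are not the QSStrings API used in the other answer.

```c
#include <stddef.h>

/* Maps one base64 alphabet character to its 6-bit value, or -1 if invalid. */
int b64_value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/*
 * Decodes a padded base64 string. `out` needs room for len / 4 * 3 bytes.
 * Returns the number of bytes written, or -1 if the input is malformed
 * (length not a multiple of 4, bad character, or misplaced padding).
 */
long b64_decode(const char *in, size_t len, unsigned char *out)
{
    if (len % 4 != 0) return -1;
    size_t j = 0;
    for (size_t i = 0; i < len; i += 4) {
        /* Padding ("=" or "==") is only allowed at the end of the last group. */
        int pad = 0;
        if (i + 4 == len && in[i + 3] == '=')
            pad = (in[i + 2] == '=') ? 2 : 1;

        /* Pack four 6-bit values into one 24-bit group. */
        unsigned long quad = 0;
        for (int k = 0; k < 4; k++) {
            int v = (k >= 4 - pad) ? 0 : b64_value(in[i + k]);
            if (v < 0) return -1;
            quad = (quad << 6) | (unsigned long)v;
        }

        /* Emit up to three bytes, dropping those that were only padding. */
        out[j++] = (unsigned char)((quad >> 16) & 0xFF);
        if (pad < 2) out[j++] = (unsigned char)((quad >> 8) & 0xFF);
        if (pad < 1) out[j++] = (unsigned char)(quad & 0xFF);
    }
    return (long)j;
}
```

The decoded bytes could be wrapped with +[NSData dataWithBytes:length:] and passed to -[NSString initWithData:encoding:]. On iOS 7 / OS X 10.9 and later, -[NSData initWithBase64EncodedString:options:] should make a hand-written decoder unnecessary.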

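Could you provide code? I'm not sure how to "convert the string above into NSData holding the encoded bytes". - hpique
OK, I'll do it, give me a few minutes. - Nickolay Olshevsky
+1 This was very helpful and pointed me in the right direction. Thanks! - hpique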

Content provided by Stack Overflow.