How to Java-configure separate data sources for Spring Batch data and business data? Should I even do it?

48

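My main job only performs read operations, and the other one does some writing, but on the MyISAM engine, which ignores transactions, so I do not necessarily need transaction support. How can I configure Spring Batch to have its own data source for the JobRepository, separate from the one holding the business data? The initial single-data-source configuration looks like this: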

@Configuration
public class StandaloneInfrastructureConfiguration {

    @Autowired
    Environment env;

    @Bean
    public LocalContainerEntityManagerFactoryBean entityManagerFactory() {
      LocalContainerEntityManagerFactoryBean em = new LocalContainerEntityManagerFactoryBean();
      em.setDataSource(dataSource());
      em.setPackagesToScan(new String[] { "org.podcastpedia.batch.*" });

      JpaVendorAdapter vendorAdapter = new HibernateJpaVendorAdapter();
      em.setJpaVendorAdapter(vendorAdapter);
      em.setJpaProperties(additionalJpaProperties());

      return em;
    }

    Properties additionalJpaProperties() {
          Properties properties = new Properties();
          properties.setProperty("hibernate.hbm2ddl.auto", "none");
          properties.setProperty("hibernate.dialect", "org.hibernate.dialect.MySQL5Dialect");
          properties.setProperty("hibernate.show_sql", "true");

          return properties;
    }

    @Bean
    public DataSource dataSource(){

       return DataSourceBuilder.create()
                .url(env.getProperty("db.url"))
                .driverClassName(env.getProperty("db.driver"))
                .username(env.getProperty("db.username"))
                .password(env.getProperty("db.password"))
                .build();          
    }

    @Bean
    public PlatformTransactionManager transactionManager(EntityManagerFactory emf){
      JpaTransactionManager transactionManager = new JpaTransactionManager();
      transactionManager.setEntityManagerFactory(emf);

      return transactionManager;
    }
}

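It is then imported into the job's configuration class, where the @EnableBatchProcessing annotation picks it up automatically. My first idea was to make the configuration class extend DefaultBatchConfigurer, but then I get a BeanCurrentlyInCreationException (org.springframework.beans.factory.BeanCurrentlyInCreationException: Error creating bean with name jobBuilders: Requested bean is currently in creation: Is there an unresolvable circular reference?):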

@Configuration
@EnableBatchProcessing
@Import({StandaloneInfrastructureConfiguration.class, NotifySubscribersServicesConfiguration.class})
public class NotifySubscribersJobConfiguration extends DefaultBatchConfigurer {

    @Autowired
    private JobBuilderFactory jobBuilders;

    @Autowired
    private StepBuilderFactory stepBuilders;

    @Autowired
    private DataSource dataSource;

    @Autowired
    Environment env;

    @Override
    @Autowired
    public void setDataSource(javax.sql.DataSource dataSource) {
        super.setDataSource(batchDataSource());
    }

    private DataSource batchDataSource(){          
       return DataSourceBuilder.create()
                .url(env.getProperty("batchdb.url"))
                .driverClassName(env.getProperty("batchdb.driver"))
                .username(env.getProperty("batchdb.username"))
                .password(env.getProperty("batchdb.password"))
                .build();          
    } 

    @Bean
    public ItemReader<User> notifySubscribersReader(){

        JdbcCursorItemReader<User> reader = new JdbcCursorItemReader<User>();
        String sql = "select * from users where is_email_subscriber is not null";

        reader.setSql(sql);
        reader.setDataSource(dataSource);
        reader.setRowMapper(rowMapper());       

        return reader;
    }
........
}   

Any ideas are more than welcome. The project is available on GitHub - https://github.com/podcastpedia/podcastpedia-batch

Thanks a lot.

7 Answers

30

OK, this is strange, but it works. Moving the data sources into their own configuration class works just fine, and they can be autowired.

The example is a multi-data-source version of the Spring Batch Service Example:

DataSourceConfiguration:

public class DataSourceConfiguration {

    @Value("classpath:schema-mysql.sql")
    private Resource schemaScript;

    @Bean
    @Primary
    public DataSource hsqldbDataSource() throws SQLException {
        final SimpleDriverDataSource dataSource = new SimpleDriverDataSource();
        dataSource.setDriver(new org.hsqldb.jdbcDriver());
        dataSource.setUrl("jdbc:hsqldb:mem:mydb");
        dataSource.setUsername("sa");
        dataSource.setPassword("");
        return dataSource;
    }

    @Bean
    public JdbcTemplate jdbcTemplate(final DataSource dataSource) {
        return new JdbcTemplate(dataSource);
    }

    @Bean
    public DataSource mysqlDataSource() throws SQLException {
        final SimpleDriverDataSource dataSource = new SimpleDriverDataSource();
        dataSource.setDriver(new com.mysql.jdbc.Driver());
        dataSource.setUrl("jdbc:mysql://localhost/spring_batch_example");
        dataSource.setUsername("test");
        dataSource.setPassword("test");
        DatabasePopulatorUtils.execute(databasePopulator(), dataSource);
        return dataSource;
    }

    @Bean
    public JdbcTemplate mysqlJdbcTemplate(@Qualifier("mysqlDataSource") final DataSource dataSource) {
        return new JdbcTemplate(dataSource);
    }

    private DatabasePopulator databasePopulator() {
        final ResourceDatabasePopulator populator = new ResourceDatabasePopulator();
        populator.addScript(schemaScript);
        return populator;
    }
}

BatchConfiguration:

@Configuration
@EnableBatchProcessing
@Import({ DataSourceConfiguration.class, MBeanExporterConfig.class })
public class BatchConfiguration {

    @Autowired
    private JobBuilderFactory jobs;

    @Autowired
    private StepBuilderFactory steps;

    @Bean
    public ItemReader<Person> reader() {
        final FlatFileItemReader<Person> reader = new FlatFileItemReader<Person>();
        reader.setResource(new ClassPathResource("sample-data.csv"));
        reader.setLineMapper(new DefaultLineMapper<Person>() {
            {
                setLineTokenizer(new DelimitedLineTokenizer() {
                    {
                        setNames(new String[] { "firstName", "lastName" });
                    }
                });
                setFieldSetMapper(new BeanWrapperFieldSetMapper<Person>() {
                    {
                        setTargetType(Person.class);
                    }
                });
            }
        });
        return reader;
    }

    @Bean
    public ItemProcessor<Person, Person> processor() {
        return new PersonItemProcessor();
    }

    @Bean
    public ItemWriter<Person> writer(@Qualifier("mysqlDataSource") final DataSource dataSource) {
        final JdbcBatchItemWriter<Person> writer = new JdbcBatchItemWriter<Person>();
        writer.setItemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<Person>());
        writer.setSql("INSERT INTO people (first_name, last_name) VALUES (:firstName, :lastName)");
        writer.setDataSource(dataSource);
        return writer;
    }

    @Bean
    public Job importUserJob(final Step s1) {
        return jobs.get("importUserJob").incrementer(new RunIdIncrementer()).flow(s1).end().build();
    }

    @Bean
    public Step step1(final ItemReader<Person> reader,
            final ItemWriter<Person> writer, final ItemProcessor<Person, Person> processor) {
        return steps.get("step1")
                .<Person, Person> chunk(1)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }
}

2
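This is a perfect answer. It was confusing that, once I added a business data source, Spring Batch tried to use it for the metadata tables as well. It is annoying that this is the default behavior rather than something that has to be explicitly overridden. - bdetweiler
1
If you want to load the data source configuration values from configuration properties, this link https://docs.spring.io/spring-boot/docs/current/reference/htmlsingle/#howto-two-datasources shows a similar approach that loads custom properties automatically. - Philippe
Yes, I have the H2 dependency. So the data source for batch has to be specified manually? - marekmuratow
Could you please guide me here: https://stackoverflow.com/questions/62510899/spring-batch-create-two-datasources-and-how-to-customized-to-use-other-propert ? - PAA
(Haven't tried it yet, but) does the DataSourceConfiguration class need to be explicitly annotated with @Configuration? - S3lvatico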

10
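I keep my data sources in a separate configuration class. In the batch configuration, we extend DefaultBatchConfigurer and override the setDataSource method, using @Qualifier to pass in the specific database that Spring Batch should use. I could not get this working with the constructor version, but the setter method works for me.
My Reader, Processor, and Writer live in their own self-contained classes, along with the steps.
This is using Spring Boot 1.1.8 and Spring Batch 3.0.1. Note: a project we set up with Spring Boot 1.1.5 behaved differently from the newer version.
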
package org.sample.config.jdbc;

import javax.sql.DataSource;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import org.springframework.core.env.Environment;

import com.atomikos.jdbc.AtomikosDataSourceBean;
import com.mysql.jdbc.jdbc2.optional.MysqlXADataSource;

/**
 * The Class DataSourceConfiguration.
 *
 */
@Configuration
public class DataSourceConfig {

    private final static Logger log = LoggerFactory.getLogger(DataSourceConfig.class);

    @Autowired private Environment env;

    /**
     * Siphon data source.
     *
     * @return the data source
     */
    @Bean(name = "mainDataSource")
    @Primary
    public DataSource mainDataSource() {

        final String user = this.env.getProperty("db.main.username");
        final String password = this.env.getProperty("db.main.password");
        final String url = this.env.getProperty("db.main.url");

        return this.getMysqlXADataSource(url, user, password);
    }

    /**
     * Batch data source.
     *
     * @return the data source
     */
    @Bean(name = "batchDataSource", initMethod = "init", destroyMethod = "close")
    public DataSource batchDataSource() {

        final String user = this.env.getProperty("db.batch.username");
        final String password = this.env.getProperty("db.batch.password");
        final String url = this.env.getProperty("db.batch.url");

        return this.getAtomikosDataSource("metaDataSource", this.getMysqlXADataSource(url, user, password));
    }

    /**
     * Gets the mysql xa data source.
     *
     * @param url the url
     * @param user the user
     * @param password the password
     * @return the mysql xa data source
     */
    private MysqlXADataSource getMysqlXADataSource(final String url, final String user, final String password) {

        final MysqlXADataSource mysql = new MysqlXADataSource();
        mysql.setUser(user);
        mysql.setPassword(password);
        mysql.setUrl(url);
        mysql.setPinGlobalTxToPhysicalConnection(true);

        return mysql;
    }

    /**
     * Gets the atomikos data source.
     *
     * @param resourceName the resource name
     * @param xaDataSource the xa data source
     * @return the atomikos data source
     */
    private AtomikosDataSourceBean getAtomikosDataSource(final String resourceName, final MysqlXADataSource xaDataSource) {

        final AtomikosDataSourceBean atomikos = new AtomikosDataSourceBean();
        atomikos.setUniqueResourceName(resourceName);
        atomikos.setXaDataSource(xaDataSource);
        atomikos.setMaxLifetime(3600);
        atomikos.setMinPoolSize(2);
        atomikos.setMaxPoolSize(10);

        return atomikos;
    }

}


package org.sample.settlement.batch;

import javax.sql.DataSource;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.DefaultBatchConfigurer;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

/**
 * The Class BatchConfiguration.
 *
 */
@Configuration
@EnableBatchProcessing
public class BatchConfiguration extends DefaultBatchConfigurer {
    private final static Logger log = LoggerFactory.getLogger(BatchConfiguration.class);
    @Autowired private JobBuilderFactory jobs;
    @Autowired private StepBuilderFactory steps;
    @Autowired private PlatformTransactionManager transactionManager;
    @Autowired @Qualifier("processStep") private Step processStep;

    /**
     * Process payments job.
     *
     * @return the job
     */
    @Bean(name = "processJob")
    public Job processJob() {
        return this.jobs.get("processJob")
                    .incrementer(new RunIdIncrementer())
                    .start(processStep)
                    .build();
    }

    @Override
    @Autowired
    public void setDataSource(@Qualifier("batchDataSource") DataSource batchDataSource) {
        super.setDataSource(batchDataSource);
    }
}

1
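Overriding DefaultBatchConfigurer's setDataSource method is what I needed. - Cameron Chapman
1
Saved my day! Thanks, dude! o/ - diogo
I tried this, but the batch tables are created in the main database, and when the batch actually runs it looks for them in the batch database, cannot find them, and fails. If I create the tables in the batch database and then configure it not to create the tables on the second run, it works! - Rajan
Spring Batch 4.1.1 - DefaultBatchConfigurer does not have a setDataSource() method. - pojo-guy
1
Overriding DefaultBatchConfigurer.setDataSource really helped. Thanks! - Tung Nguyen
Could you please guide me here: https://stackoverflow.com/questions/62510899/spring-batch-create-two-datasources-and-how-to-customized-to-use-other-propert ? - PAA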

5
According to the documentation at https://docs.spring.io/spring-boot/docs/current/reference/htmlsingle/#howto-two-datasources:
@Bean
@Primary
@ConfigurationProperties("app.datasource.first")
public DataSourceProperties firstDataSourceProperties() {
    return new DataSourceProperties();
}

@Bean
@Primary
@ConfigurationProperties("app.datasource.first")
public DataSource firstDataSource() {
    return firstDataSourceProperties().initializeDataSourceBuilder().build();
}

@Bean
@ConfigurationProperties("app.datasource.second")
public DataSourceProperties secondDataSourceProperties() {
    return new DataSourceProperties();
}

@Bean
@ConfigurationProperties("app.datasource.second")
public DataSource secondDataSource() {
    return secondDataSourceProperties().initializeDataSourceBuilder().build();
}

In the application properties, you can use the regular data source properties:
app.datasource.first.type=com.zaxxer.hikari.HikariDataSource
app.datasource.first.maximum-pool-size=30

app.datasource.second.url=jdbc:mysql://localhost/test
app.datasource.second.username=dbuser
app.datasource.second.password=dbpass
app.datasource.second.max-total=30

5
If your Spring Boot version is 2.2.0 or later, add @BatchDataSource to the batch data source. The annotation is documented as follows:
/**
 * Qualifier annotation for a DataSource to be injected into Batch auto-configuration. Can
 * be used on a secondary data source, if there is another one marked as
 * {@link Primary @Primary}.
 *
 * @author Dmytro Nosan
 * @since 2.2.0
 */
@Target({ ElementType.FIELD, ElementType.METHOD, ElementType.PARAMETER, ElementType.TYPE, ElementType.ANNOTATION_TYPE })
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Qualifier
public @interface BatchDataSource {

}

For example:

@BatchDataSource
@Bean("batchDataSource")
public DataSource batchDataSource(@Qualifier("batchDataSourceProperties") DataSourceProperties dataSourceProperties) {
        return dataSourceProperties
                .initializeDataSourceBuilder()
                .type(HikariDataSource.class)
                .build();
}

Best solution. Even schema generation works: spring.batch.jdbc.initialize-schema=ALWAYS - undefined

4

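Following Frozen's suggestion in his answer, I used two DataSources. In addition, I needed to define a BatchDataSourceInitializer to initialize the batch DataSource properly, as Michael Minella suggested in his answer to this related question.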

DataSource configuration

@Configuration
public class DataSourceConfiguration {

    @Bean
    @Primary
    @ConfigurationProperties("domain.datasource")
    public DataSource domainDataSource() {
        return DataSourceBuilder.create().build();
    }

    @Bean("batchDataSource")
    @ConfigurationProperties("batch.datasource")
    public DataSource batchDataSource() {
        return DataSourceBuilder.create().build();
    }
}

Batch configuration

@Configuration
@EnableBatchProcessing
public class BatchConfiguration extends DefaultBatchConfigurer {

    @Override
    @Autowired
    public void setDataSource(@Qualifier("batchDataSource") DataSource batchDataSource) {
        super.setDataSource(batchDataSource);
    }

    @Bean
    public BatchDataSourceInitializer batchDataSourceInitializer(@Qualifier("batchDataSource") DataSource batchDataSource,
            ResourceLoader resourceLoader) {
        return new BatchDataSourceInitializer(batchDataSource, resourceLoader, new BatchProperties());
    }
}

application.properties:

# Sample configuration using an H2 in-memory DB
domain.datasource.jdbcUrl=jdbc:h2:mem:domain-ds;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=FALSE
domain.datasource.username=sa
domain.datasource.password=
domain.datasource.driver=org.h2.Driver

batch.datasource.jdbcUrl=jdbc:h2:mem:batch-ds;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=FALSE
batch.datasource.username=sa
batch.datasource.password=
batch.datasource.driver=org.h2.Driver

3

Have you already tried something like this?

@Bean(name="batchDataSource")
public DataSource batchDataSource(){          
       return DataSourceBuilder.create()
                .url(env.getProperty("batchdb.url"))
                .driverClassName(env.getProperty("batchdb.driver"))
                .username(env.getProperty("batchdb.username"))
                .password(env.getProperty("batchdb.password"))
                .build();          
} 

Then mark the other data source with @Primary, and use @Qualifier in your batch configuration to specify that you want the batchDataSource bean autowired.


Could you please guide me here: https://stackoverflow.com/questions/62510899/spring-batch-create-two-datasources-and-how-to-customized-to-use-other-propert ? - PAA

1
Suppose you have two data sources, one for the Spring Batch metadata such as job details [say CONFIGDB], and another for your business data [say AppDB]:
Inject CONFIGDB into the jobRepository as follows:
 <bean id="jobRepository"
    class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
    <property name="transactionManager" ref="transactionManager" />
    <property name="dataSource" ref="CONFIGDB" />
    <property name="databaseType" value="db2" />
    <property name="tablePrefix" value="CONFIGDB.BATCH_" />
  </bean>
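
To make the effect of the tablePrefix property concrete: the default metadata tables are named with the prefix BATCH_ (for example BATCH_JOB_INSTANCE), so a schema-qualified prefix such as CONFIGDB.BATCH_ makes the repository address CONFIGDB.BATCH_JOB_INSTANCE instead. The following is only a minimal plain-Java sketch of that kind of prefix substitution. The class name, the placeholder token, and the query text are assumptions made for illustration, not Spring Batch's own code.

```java
// Simplified illustration only; this is not Spring Batch's implementation.
final class TablePrefixSketch {

    // Placeholder token used by the query templates in this sketch (hypothetical).
    private static final String PREFIX_TOKEN = "%PREFIX%";

    // Substitutes the configured table prefix into a query template.
    static String resolve(String queryTemplate, String tablePrefix) {
        return queryTemplate.replace(PREFIX_TOKEN, tablePrefix);
    }

    public static void main(String[] args) {
        String template = "SELECT JOB_INSTANCE_ID FROM %PREFIX%JOB_INSTANCE WHERE JOB_NAME = ?";
        // With the prefix from the XML above, the table becomes schema-qualified.
        System.out.println(resolve(template, "CONFIGDB.BATCH_"));
        // prints: SELECT JOB_INSTANCE_ID FROM CONFIGDB.BATCH_JOB_INSTANCE WHERE JOB_NAME = ?
    }
}
```

The practical consequence is that the metadata tables must already exist under that schema in the CONFIGDB data source, since the prefix only changes which table names are queried.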

Now you can inject the AppDB data source into your DAOs or writers (if any), for example:
   <bean id="DemoItemWriter" class="com.demoItemWriter">
     <property name="dataSource" ref="AppDB" />     
   </bean>

Alternatively, you can define a resource and inject AppDB via a JNDI lookup in the class that needs it, for example:

public class ExampleDAO {

@Resource(lookup = "java:comp/env/jdbc/AppDB")
DataSource ds;

}
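
If annotation-based injection is not available (for instance in a class the container does not manage), the same JNDI name can be looked up programmatically. This is a hedged sketch using only the standard javax.naming and javax.sql APIs. The class name JndiDataSourceLookup and its find method are hypothetical, and the JNDI name is the one from the example above.

```java
import java.util.Optional;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

// Hypothetical helper: programmatic equivalent of the @Resource lookup above.
final class JndiDataSourceLookup {

    // Returns the DataSource bound under the given JNDI name,
    // or an empty Optional if the lookup fails or the binding is not a DataSource.
    static Optional<DataSource> find(String jndiName) {
        try {
            Object bound = new InitialContext().lookup(jndiName);
            return bound instanceof DataSource
                    ? Optional.of((DataSource) bound)
                    : Optional.empty();
        } catch (NamingException e) {
            // No naming context (e.g. running outside a container) or name not bound.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        Optional<DataSource> appDb = find("java:comp/env/jdbc/AppDB");
        System.out.println(appDb.isPresent() ? "AppDB found" : "AppDB not available");
    }
}
```

Returning an Optional rather than throwing keeps the caller free to fall back to another data source when the code runs outside the application server.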

