Talend Open Studio and Kerberos authentication integration

Recently I was working to integrate Talend Open Studio into our Windows domain environment via Kerberos authentication. Rather then use the built in tMSSQLConnection component which uses the jTDS driver behind the scenes. I have decided to use the native JDBC Microsoft SQL SERVER JDBC Microsoft SQL SERVER driver.. The Microsoft JDBC driver supports more features then the tMSSqlConnection_1 connector. For example, MSSQL SERVER Always On is supported with the parameter multiSubnetFailover=true or routing to the readonly version of a database applicationIntent=ReadOnly.

Connection string for Active Directory kerberos authenticaiton:



However, one of the requirements of Kerberos authentication is that you acquire a ticket from the key distribution center (KDC) server. If you have not acquired a ticket from the KDC server you will receive an error from Talend Open studio like the one below.

Starting job cdc_test at 13:54 12/08/2015.

[statistics] connecting to socket on port 4019
[statistics] connected
Exception in component tJDBCConnection_1
com.microsoft.sqlserver.jdbc.SQLServerException: Integrated authentication failed. ClientConnectionId:219f9461-59d8-41fa-8af6-483f4e2a86bd
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:1668)
	at com.microsoft.sqlserver.jdbc.KerbAuthentication.intAuthInit(KerbAuthentication.java:140)
	at com.microsoft.sqlserver.jdbc.KerbAuthentication.GenerateClientContext(KerbAuthentication.java:268)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.sendLogon(SQLServerConnection.java:2709)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.logon(SQLServerConnection.java:2243)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.access$000(SQLServerConnection.java:42)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection$LogonCommand.doExecute(SQLServerConnection.java:2229)
	at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:5696)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1716)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:1327)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:992)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:828)
	at com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:1012)
	at java.sql.DriverManager.getConnection(DriverManager.java:571)
	at java.sql.DriverManager.getConnection(DriverManager.java:215)
	at myfirsttalendjob.cdc_test_0_1.cdc_test.tJDBCConnection_1Process(cdc_test.java:552)
	at myfirsttalendjob.cdc_test_0_1.cdc_test$4.run(cdc_test.java:3492)
Caused by: javax.security.auth.login.LoginException: Unable to obtain Princpal Name for authentication 
	at com.sun.security.auth.module.Krb5LoginModule.promptForName(Krb5LoginModule.java:800)
	at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:671)
	at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:584)
	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at javax.security.auth.login.LoginContext.invoke(LoginContext.java:762)
	at javax.security.auth.login.LoginContext.access$000(LoginContext.java:203)
	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:690)
	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:688)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:687)
	at javax.security.auth.login.LoginContext.login(LoginContext.java:595)
	at com.microsoft.sqlserver.jdbc.KerbAuthentication.intAuthInit(KerbAuthentication.java:133)
	... 15 more
[statistics] disconnected
Job cdc_test ended at 13:54 12/08/2015. [exit code=1]

To work around this issue I created a Windows batch file. When I start up Talend using the batch file it asks me for my Windows password. I enter my Windows Active Directory password and a ticket is created from the KDC.

c:\program files\java\jre7\bin\kinit.exe tatroc@LAB.NET
C:\Talend_Ent\Talend-Studio-20141207_1530-V5.6.1\Talend-Studio-win-x86_64.exe -vm C:\Program Files\Java\jdk1.7.0_75\bin

The connection to the MSSQL SERVER using kerberos is now successful.
Talend Platform for Data Services with Big Data ( _ myfirstt_2015-08-12_14-34-22

MySQL Performance Schema

The MySQL Performance Schema is an extremely valuable feature in MySQL 5.6 for finding queries which have a performance impact on the database. However, the performance schema is not enabled by default. The performance schema can be enabled through several different methods. Possible reasons why enabling the performance schema is a bad thing includes performance impact on the server. Percona has a good write up on the performance impact of enabling the Performance Schema. Therefore be careful when enabling the performance schema in a production environment as it can add a significant amount of overhead.

Setting the performance_schema directives in the mn.cnf file will ensure that the performance_schema is enabled at startup. One can also enable the performance_schema by running SQL queries after MySQL is up and running. This can be useful if you want to disable and enable the performance_schema to troubleshoot performance issues then turn them back off without having to restart MySQL.

vim /etc/my.cnf



Consumers are setup in the setup_consumers table this specified where the data will go.

select * from performance_schema.setup_consumers;

MySQL Workbench_2015-08-11_07-12-25

View which objects have the performance schema setup on them.
The screen shot below shows that performance stats will be collected for all databases except for mysql, Performance_schema and information_schema.

The setup_actors table determines who to monitor, you can filter by host, user or role.

To view the current instruments which are enabled run the following query.

select * from setup_instruments

Instruments can be enabled by running an UPDATE statement on the table.

update setup_instruments
set enabled = 'yes', timed = 'yes'
where name = 'statement/sql/lock_tables'


After the performance_schema has been configured you will be able to view the sys tables for statistics collected on queries.

select * 
from sys.`x$statements_with_full_table_scans`

Here it is important to note the difference between the average rows sent and the average rows examined. This means that the query optimizer is examining many rows but only sending a a small number. If there is a large difference between the two. This may indicate that their is a need for an index.

MySQL Workbench_2015-08-11_08-06-52

Additional performance tables can be viewed which present statistics on number of temp tables created.

select * 
from sys.`x$statements_with_temp_tables`
select * 
from sys.`x$statement_analysis`

Reference: For more in depth informaiton on the the performance schema and A Visual Guide to the MySQL Performance Schema