Channel: Troubleshooting – Help: SQL Server

Solution : The connection to the primary replica is not active. The command cannot be processed.

It has been close to a year since I published my first book, SQL Server 2012 AlwaysOn (Paperback, Kindle), and since then many DBAs have contacted me to troubleshoot various issues related to AlwaysOn Availability Groups. One of the most common errors I have seen is shown below.

Msg 35250, Level 16, State 7, Line 1
The connection to the primary replica is not active. The command cannot be processed.

This error mostly appears when we try to join a database to an availability group, whether through the UI, T-SQL or PowerShell.

SSMS UI:

While creating a new Availability Group, we might receive the error below, and the “join” step would fail.

image

Here is the message in text format.

TITLE: Microsoft SQL Server Management Studio
——————————
Joining database on secondary replica resulted in an error.  (Microsoft.SqlServer.Management.HadrTasks)
——————————
ADDITIONAL INFORMATION:
Failed to join the database ‘Production’ to the availability group ‘ProductionAG’ on the availability replica ‘SRV2’. (Microsoft.SqlServer.Smo)
For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft+SQL+Server&ProdVer=11.0.2100.60+((SQL11_RTM).120210-1917+)&EvtSrc=Microsoft.SqlServer.Management.Smo.ExceptionTemplates.FailedOperationExceptionText&LinkId=20476
——————————
An exception occurred while executing a Transact-SQL statement or batch. (Microsoft.SqlServer.ConnectionInfo)
——————————
The connection to the primary replica is not active.  The command cannot be processed. (Microsoft SQL Server, Error: 35250)
For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%20SQL%20Server&ProdVer=11.00.2100&EvtSrc=MSSQLServer&EvtID=35250&LinkId=20476
——————————
BUTTONS:
OK
——————————

T-SQL:

image

Msg 35250, Level 16, State 7, Line 1
The connection to the primary replica is not active. The command cannot be processed.

PowerShell:

Add-SqlAvailabilityDatabase -Path "SQLSERVER:\SQL\SRV2\DEFAULT\AvailabilityGroups\ProductionAG" -Database "Production"

**********************************
Add-SqlAvailabilityDatabase : The connection to the primary replica is not active.  The command cannot be processed.
At line:1 char:28
+ Add-SqlAvailabilityDatabase <<<<  -Path "SQLSERVER:\SQL\SRV2\DEFAULT\AvailabilityGroups\ProductionAG" -Database "Production"
     + CategoryInfo          : InvalidOperation: (:) [Add-SqlAvailabilityDatabase], SqlException
     + FullyQualifiedErrorId : ExecutionFailed,Microsoft.SqlServer.Management.PowerShell.Hadr.AddSqlAvailabilityGroupDatabaseCommand
**********************************

Solution

I always suggest starting with the ERRORLOG and checking which error is logged there. Here is the error most DBAs have reported.

2014-06-30 17:29:33.500 Logon        Database Mirroring login attempt by user ‘HADOMAIN\SRV1$.’ failed with error: ‘Connection handshake failed. The login ‘HADOMAIN\SRV1$’ does not have CONNECT permission on the endpoint. State 84.’.  [CLIENT: 192.168.1.11]

In the above message, HADOMAIN is my domain name and SRV1 is the host name of the SQL Server hosting the primary replica.

Here is what has solved the issue for them.

  • Change the SQL Server service account to a domain account and grant it CONNECT permission on the endpoint of the other instances. If we are using different domain accounts on each replica, then we need to add the service accounts of all secondary replicas as logins on the primary replica.
  • If we are using a non-domain account (such as LocalSystem or the NT Service\MSSQLServer account) as the service account and can’t change it to a domain account, then we need to create the machine account as a login and grant it CONNECT permission. In our case the machine name is SRV1, so the machine account is HADOMAIN\SRV1$ (the trailing $ denotes a computer account).

     

    create login [HADOMAIN\SRV1$] from windows;
    go
    grant connect on endpoint::Mirroring to [HADOMAIN\SRV1$];
    go

Note: The endpoint name might be different. We need to pick it as shown in the image below. If you configured the availability group via the UI earlier, it should be Hadr_endpoint.

image
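When many replicas are involved, it can help to pull the refused logins straight out of the ERRORLOG and generate the statements shown above. The following is only a minimal Python sketch: the function names are made up for illustration, the regular expression is based on the single log line quoted in this post, and the default endpoint name is an assumption that should be replaced with the real one.

```python
import re

# Pattern based on the handshake failure quoted in this post.
# Quotes may be straight or typographic depending on how the log was copied.
_HANDSHAKE = re.compile(
    r"The login ['‘’](?P<login>[^'‘’]+)['‘’] does not have CONNECT permission on the endpoint"
)

def logins_missing_connect(errorlog_lines):
    """Return the distinct logins (in first-seen order) that were refused CONNECT."""
    seen = []
    for line in errorlog_lines:
        match = _HANDSHAKE.search(line)
        if match and match.group("login") not in seen:
            seen.append(match.group("login"))
    return seen

def grant_script(login, endpoint="Hadr_endpoint"):
    """Build the two statements from this post for one login.

    The default endpoint name is an assumption; check the real name first.
    CREATE LOGIN fails if the login already exists, so drop that line if needed.
    """
    # Double any ] so the names stay valid bracket-quoted identifiers.
    quoted_login = "[" + login.replace("]", "]]") + "]"
    quoted_endpoint = "[" + endpoint.replace("]", "]]") + "]"
    return (
        f"CREATE LOGIN {quoted_login} FROM WINDOWS;\n"
        f"GRANT CONNECT ON ENDPOINT::{quoted_endpoint} TO {quoted_login};"
    )
```

Feeding the ERRORLOG line from this post to logins_missing_connect gives HADOMAIN\SRV1$, and grant_script turns that into the CREATE LOGIN and GRANT CONNECT statements to review before running.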

If you are running a firewall, please make sure that the port used by the availability group endpoint is not blocked. We can easily find the port using the query below:

SELECT
te.port AS [ListenerPort],
te.is_dynamic_port AS [IsDynamicPort],
ISNULL(te.ip_address,'') AS [ListenerIPAddress],
CAST(case when te.endpoint_id < 65536 then 1 else 0 end AS bit) AS [IsSystemObject]
FROM
sys.endpoints AS e
INNER JOIN sys.tcp_endpoints AS te ON te.endpoint_id=e.endpoint_id

image

Make sure that you have added an exception for this port in the firewall.
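A quick way to confirm that the firewall exception works is to attempt a plain TCP connection to the endpoint port from the other replica. Here is a small Python sketch for that; the function name is hypothetical, and the host and port must come from your own environment (5022 is only the port commonly used for Hadr_endpoint, so use the value returned by the query above).

```python
import socket

def endpoint_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False

# Example call (assumed values): endpoint_reachable("SRV1", 5022)
```

A True result only proves that the port is open at the network level. The CONNECT permission on the endpoint still has to be granted as described earlier.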

This is already documented in Books Online:

{

If any server instances that are hosting the availability replicas for an availability group run as different accounts, the login each account must be created in master on the other server instance. Then, that login must be granted CONNECT permissions to connect to the database mirroring endpoint of that server instance.

}

Hope this would help you.

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Author: SQL Server 2012 AlwaysOn (Paperback, Kindle)


  • Help : How to find cause of “Login failed for user” error

    “Login failed for user” is one of the most common errors, and almost everyone has seen it at least once. In this blog, I am going to share a few possible causes of the error and their solutions. First, we need to understand that, for security reasons, SQL Server doesn’t send detailed information about this error to the client. Here is what the client would see in all situations (I have reproduced it from SSMS).

    TITLE: Connect to Server
    ——————————
    Cannot connect to .\SQL2014.
    ——————————
    ADDITIONAL INFORMATION:
    Login failed for user ‘sa’. (Microsoft SQL Server, Error: 18456)
    For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%20SQL%20Server&EvtSrc=MSSQLServer&EvtID=18456&LinkId=20476
    ——————————
    BUTTONS:
    OK
    ——————————

     

    If we click on more details, we would see this

    ——————————
    Server Name: .\SQL2014
    Error Number: 18456
    Severity: 14
    State: 1
    Line Number: 65536

    This error doesn’t tell the “exact” reason why the login failed. SQL Server deliberately hides the nature of the authentication error and gives State 1.

    The very first thing I always ask is to look at the ERRORLOG and find the error logged at exactly the same time. By default, auditing of failed logins is enabled, so the true state of the 18456 error is reported in the SQL Server ERRORLOG file. You can verify the setting by right-clicking the server node > Properties > Security and checking the login auditing option.

    image

    Here is what we would see in the ERRORLOG if failed login auditing (option 1 and 3 in above image) is enabled.

    2014-07-08 04:37:07.910 Logon        Error: 18456, Severity: 14, State: 8.
    2014-07-08 04:37:07.910 Logon        Login failed for user ‘sa’. Reason: Password did not match that for the login provided. [CLIENT: <local machine>]

    It is important to note that in earlier versions of SQL Server (before SQL 2008), the error message looked something like the one below.

    2014-07-08 04:37:07.910 Logon        Error: 18456, Severity: 14, State: 8.
    2014-07-08 04:37:07.910 Logon        Login failed for user ‘sa’. [CLIENT: <local machine>]

    Notice that the reason was not shown in the ERRORLOG in earlier versions of SQL Server. In those days, we used to keep track of all the states and their meanings, but thankfully I don’t need to any more. Here is the quick cheat sheet which I have saved. You can easily reproduce these states. I have used meaningful login names.

    State 5: Incorrect login name provided.

    2014-07-08 05:35:48.850 Logon        Error: 18456, Severity: 14, State: 5.
    2014-07-08 05:35:48.850 Logon        Login failed for user ‘InvalidLogin’. Reason: Could not find a login matching the name provided. [CLIENT: <local machine>]

    State 7: Account disabled AND incorrect password.

    2014-07-08 05:41:30.390 Logon        Error: 18456, Severity: 14, State: 7.
    2014-07-08 05:41:30.390 Logon        Login failed for user ‘DisabledLogin’. Reason: An error occurred while evaluating the password. [CLIENT: <local machine>]

    Notice that state 7 appears when the account is disabled and an incorrect password is supplied. If the correct password is supplied for a disabled login, then the client gets the error below (18470).

    2014-07-08 05:41:22.950 Logon        Error: 18470, Severity: 14, State: 1.
    2014-07-08 05:41:22.950 Logon        Login failed for user ‘DisabledLogin’. Reason: The account is disabled. [CLIENT: <local machine>]

    State 58: SQL account used to attempt to login on “Windows Only” authentication mode SQL instance.

    2014-07-08 06:21:36.780 Logon        Error: 18456, Severity: 14, State: 58.
    2014-07-08 06:21:36.780 Logon        Login failed for user ‘sa’. Reason: An attempt to login using SQL authentication failed. Server is configured for Windows authentication only. [CLIENT: <local machine>]

    The above can be fixed by switching the instance to SQL Server and Windows Authentication mode (mixed mode); see the blog by Pinal on this topic.

    State 11: SQL login account getting deny permission via some group membership.

    2014-07-08 06:21:36.780 Logon        Error: 18456, Severity: 14, State: 11.
    2014-07-08 06:21:36.780 Logon        Login failed for user ‘SQL_Login’. Reason: Token-based server access validation failed with an infrastructure error. Check for previous errors. [CLIENT: <local machine>]

    State 12: Windows login account getting deny permission via some group membership or UAC.

    2014-07-08 06:21:34.840 Logon        Error: 18456, Severity: 14, State: 12.
    2014-07-08 06:21:34.840 Logon        Login failed for user ‘domain\user’. Reason: Token-based server access validation failed with an infrastructure error. Check for previous errors. [CLIENT: <local machine>]

    To solve state 11 and 12, it is important to find out how the account is being denied permission. In most cases the cause is UAC (which applies only to local connections), and the grant below should fix it.

    GRANT CONNECT SQL TO [DOMAIN\User]

    In a few cases, we can run the query below to find out whether any group (or other principal) has a DENY permission.

    SELECT prin.[name]
        ,prin.type_desc
    FROM sys.server_principals prin
    INNER JOIN sys.server_permissions PERM ON prin.principal_id = PERM.grantee_principal_id
    WHERE PERM.state_desc = 'DENY'
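The cheat sheet above is easy to turn into a lookup when scanning a long ERRORLOG. This is just a Python sketch under the assumption that the log lines look exactly like the excerpts in this post; the function name is made up, and the table only contains the states discussed here.

```python
import re

# Causes taken from the cheat sheet in this post; other states are not covered here.
STATE_CAUSES = {
    5: "Incorrect login name provided",
    7: "Account disabled AND incorrect password",
    8: "Password did not match that for the login provided",
    11: "SQL login denied access, e.g. via group membership",
    12: "Windows login denied access, e.g. via group membership or UAC",
    58: "SQL login attempted on a Windows-authentication-only instance",
}

_ERROR_LINE = re.compile(r"Error: 18456, Severity: \d+, State: (\d+)\.")

def explain_login_failure(errorlog_line):
    """Return (state, cause) for an 18456 ERRORLOG line, or None if it is not one."""
    match = _ERROR_LINE.search(errorlog_line)
    if match is None:
        return None
    state = int(match.group(1))
    return state, STATE_CAUSES.get(state, "Not in this cheat sheet")
```

Lines for other errors, such as 18470 for a disabled login, return None, so the function can be applied to every line of the log.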
    

     

    If you find any other state, post it in the comments and I will enhance the blog further by adding its cause.

    Hope this helped!

  • Troubleshooting : Slow Delete Database from Management Studio

    Not very long ago, I had a database on my SQL 2012 instance which was getting log shipped happily at a frequency of 1 minute. Long ago I had set up this configuration for a demo and forgotten about it. Today I had a “memory recall” when the disk was filling up with log backups. Since the demo was complete, I decided to drop the database. So I broke log shipping and tried dropping the database. What do you do as a DBA to drop a database? Right Click > Delete .. huh?

    image

    When I clicked the “OK” button, it took a long time and the GUI seemed to hang. As usual, troubleshooting started! I ran my standard troubleshooting query to find out what was going on.

     SELECT s.session_id
        ,r.STATUS
        ,r.blocking_session_id 'Blk by'
        ,r.wait_type
        ,wait_resource
        ,r.wait_time / (1000.0) 'Wait Sec'
        ,r.cpu_time
        ,r.logical_reads
        ,r.reads
        ,r.writes
        ,r.total_elapsed_time / (1000.0) 'Elaps Sec'
        ,Substring(st.TEXT, (r.statement_start_offset / 2) + 1, (
                (
                    CASE r.statement_end_offset
                        WHEN - 1
                            THEN Datalength(st.TEXT)
                        ELSE r.statement_end_offset
                        END - r.statement_start_offset
                    ) / 2
                ) + 1) AS statement_text
        ,Coalesce(Quotename(Db_name(st.dbid)) + N'.' + Quotename(Object_schema_name(st.objectid, st.dbid)) + N'.' + Quotename(Object_name(st.objectid, st.dbid)), '') AS command_text
        ,r.command
        ,s.login_name
        ,s.host_name
        ,s.program_name
        ,s.host_process_id
        ,s.last_request_end_time
        ,s.login_time
        ,r.open_transaction_count
    FROM sys.dm_exec_sessions AS s
    INNER JOIN sys.dm_exec_requests AS r ON r.session_id = s.session_id
    CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
    WHERE r.session_id != @@SPID
    ORDER BY r.cpu_time DESC
        ,r.STATUS
        ,r.blocking_session_id
        ,s.session_id

    Here was the result (I have removed a few columns to avoid clutter).

    session_id:     60
    status:         runnable
    cpu_time:       247734
    logical_reads:  42732577
    writes:         29196
    Elaps Sec:      545.223
    statement_text: DELETE msdb.dbo.backupmediaset
                    FROM msdb.dbo.backupmediaset bms
                    WHERE bms.media_set_id IN (SELECT media_set_id
                         FROM @media_set_id)
                        AND ((SELECT COUNT(*)
                      FROM msdb.dbo.backupset
                      WHERE media_set_
    command_text:   [msdb].[dbo].[sp_delete_database_backuphistory]

     

    Why would deleting a database do that? Well, it’s done by a small checkbox which we never noticed (“Delete backup and restore history information for databases”).

    image

    That little checkbox executed this command (along with drop database). If we use the “Script” button, this is the outcome:

    EXEC msdb.dbo.sp_delete_database_backuphistory @database_name = N'AdventureWorks2014'
    GO
    USE [master]
    GO
    DROP DATABASE [AdventureWorks2014]
    GO
    
    

    Now we know why it’s taking time, but can this be made faster? Well, I checked the msdb database, and a few indexes have been added in SQL Server 2014 which would help in this situation. Here is a quick comparison.

    Select @@version
    go
    SELECT 
         TableName = t.name,
         IndexName = ind.name,
         ColumnName = col.name
    FROM 
         sys.indexes ind 
    INNER JOIN 
         sys.index_columns ic ON  ind.object_id = ic.object_id and ind.index_id = ic.index_id 
    INNER JOIN 
         sys.columns col ON ic.object_id = col.object_id and ic.column_id = col.column_id 
    INNER JOIN 
         sys.tables t ON ind.object_id = t.object_id 
    WHERE 
         ind.is_primary_key = 0 
         AND t.is_ms_shipped = 1
         AND t.name in ( 'backupfile', 'backupfilegroup', 'backupmediafamily', 'backupmediaset', 'backupset', 'restorefile', 'restorefilegroup', 
    'restorehistory' ) ORDER BY t.name, ind.name, ind.index_id, ic.index_column_id

    image

    image

    If you are facing the problem I described on SQL 2008 or SQL 2012, you may want to try creating new indexes as advised on other blogs (search for “msdb performance tuning” in Bing/Google), but my only caution is that modifying msdb this way might be unsupported.

    If you clean msdb backup history regularly, you might not face the issue at all. There is a maintenance plan task to do that. Try it out!
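If the history has already grown large, one option is to trim it in small date steps using msdb.dbo.sp_delete_backuphistory, which deletes history older than the date passed in @oldest_date. The Python sketch below only generates those calls; the function name, the retention period and the 30-day step are assumptions for illustration, and the date of the oldest history row has to be looked up in msdb first.

```python
from datetime import date, timedelta

def cleanup_statements(oldest_history, keep_days, today, step_days=30):
    """Return sp_delete_backuphistory calls that trim msdb history in small steps.

    oldest_history: date of the oldest backup history row (look it up in msdb first).
    keep_days: how many days of history to retain (a policy choice, assumed here).
    """
    cutoff = today - timedelta(days=keep_days)
    statements = []
    boundary = oldest_history + timedelta(days=step_days)
    # Walk forward one step at a time so each call deletes a limited slice.
    while boundary < cutoff:
        statements.append(
            f"EXEC msdb.dbo.sp_delete_backuphistory @oldest_date = '{boundary:%Y%m%d}';"
        )
        boundary += timedelta(days=step_days)
    # The final call removes whatever is left before the retention cutoff.
    if oldest_history < cutoff:
        statements.append(
            f"EXEC msdb.dbo.sp_delete_backuphistory @oldest_date = '{cutoff:%Y%m%d}';"
        )
    return statements
```

Running the generated statements one by one keeps each delete short instead of letting a single call chew through years of history.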

    Hope this helps.

  • Tips and Tricks : OS error: 32(The process cannot access the file because it is being used by another process.).

    While playing with the tempdb database on my machine, I made a mistake and was then unable to start one SQL instance. As usual, I started troubleshooting and used a Sysinternals tool to find the cause of the problem.

    First, I looked into the ERRORLOG and found the messages below. I have highlighted some text for clarity.

    2014-08-07 05:53:44.13 spid11s     Clearing tempdb database.
    2014-08-07 05:53:44.40 spid11s     Error: 5123, Severity: 16, State: 1.
    2014-08-07 05:53:44.40 spid11s     CREATE FILE encountered operating system error 32(The process cannot access the file because it is being used by another process.) while attempting to open or create the physical file ‘E:\TempDB\tempdb.mdf’.

    2014-08-07 05:53:45.42 spid11s     Error: 17204, Severity: 16, State: 1.
    2014-08-07 05:53:45.42 spid11s     FCB::Open failed: Could not open file E:\TempDB\tempdb.mdf for file number 1.  OS error: 32(The process cannot access the file because it is being used by another process.).
    2014-08-07 05:53:45.43 spid11s     Error: 5120, Severity: 16, State: 101.
    2014-08-07 05:53:45.43 spid11s     Unable to open the physical file "E:\TempDB\tempdb.mdf". Operating system error 32: "32(The process cannot access the file because it is being used by another process.)".
    2014-08-07 05:53:45.46 spid11s     Error: 1802, Severity: 16, State: 4.
    2014-08-07 05:53:45.46 spid11s     CREATE DATABASE failed. Some file names listed could not be created. Check related errors.
    2014-08-07 05:53:45.46 spid11s     Could not create tempdb. You may not have enough disk space available. Free additional disk space by deleting other files on the tempdb drive and then restart SQL Server. Check for additional errors in the event log that may indicate why the tempdb files could not be initialized.
    2014-08-07 05:53:45.46 spid11s     SQL Trace was stopped due to server shutdown. Trace ID = ‘1’. This is an informational message only; no user action is required.
    2014-08-07 05:53:49.68 Logon       Error: 17188, Severity: 16, State: 1.
    2014-08-07 05:53:49.68 Logon       SQL Server cannot accept new connections, because it is shutting down. The connection has been closed. [CLIENT: <local machine>]

    Due to OS error 32, SQL Server was not able to use the files needed by the tempdb database and was unable to start.
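Before hunting for the other process, it helps to list exactly which files hit OS error 32. The Python sketch below does that for the three message formats seen in the ERRORLOG above; the patterns are based only on those excerpts and the function name is made up, so other message variants may need extra patterns.

```python
import re

# One pattern per message format seen in the ERRORLOG excerpt in this post.
_PATTERNS = [
    re.compile(
        r"operating system error 32\(.*?physical file ['‘](?P<path>[^'’]+)['’]",
        re.IGNORECASE,
    ),
    re.compile(r"Could not open file (?P<path>.+?) for file number \d+\.\s+OS error: 32\("),
    re.compile(r'Unable to open the physical file "(?P<path>[^"]+)"\. Operating system error 32:'),
]

def files_locked_by_another_process(errorlog_lines):
    """Return distinct file paths that failed with OS error 32, in first-seen order."""
    found = {}
    for line in errorlog_lines:
        for pattern in _PATTERNS:
            match = pattern.search(line)
            if match:
                path = match.group("path")
                # Windows paths are case-insensitive, so compare them in lower case.
                found.setdefault(path.lower(), path)
                break
    return list(found.values())
```

For the log above this yields a single path, E:\TempDB\tempdb.mdf, which is the name to search for in the next step.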

    The next task is to find out which process is that “another process”. If it’s an open handle held by a user-mode process, we can find it using Process Explorer. Once we download and run it, we see something like below.

    image

    Then press Ctrl+F or use the menu option “Find” > “Find Handle or DLL” as shown below.

    image

    In the find window, provide the file name with its complete path and search. I got the result below.

    image

    Now that I know the “another process”, I can take suitable action. This happened to me because two of my instances were pointing to the same location for their tempdb database files. I rectified it on the non-starting instance with the steps below.

    1. Started SQL Server using “Net Start MSSQL$SQL2014 /mSQLCMD /f /T3608”
    2. Connected to SQL via “SQLCMD -S(local)\SQL2014”
    3. Executed below T-SQL

    USE master; 
    GO 
    ALTER DATABASE tempdb 
    MODIFY FILE (NAME = tempdev, FILENAME = 'E:\Program Files\Microsoft SQL Server\MSSQL12.SQL2014\MSSQL\DATA\tempdb.mdf'); 
    GO 
    ALTER DATABASE tempdb 
    MODIFY FILE (NAME = templog, FILENAME = 'E:\Program Files\Microsoft SQL Server\MSSQL12.SQL2014\MSSQL\DATA\tempdb.ldf'); 
    GO 

     

    4. Stopped SQL via “net stop MSSQL$SQL2014”

    5. Started SQL normally.

    image

    Hope this helps you in troubleshooting OS error 32 for other applications as well.

  • Tips and Tricks : Error: 5171 – tempdb.mdf is not a primary database file

    If you are getting the same error for a database other than tempdb, then there is a serious issue with the file. The primary file is the database file which contains information about the database itself, such as the location and size of the other files. Error 5171 means that SQL Server is attempting to read this information from a file that is not the primary file.

    While doing some testing with the tempdb database, I started getting the errors below in the ERRORLOG, and SQL Server would not start.

    2014-08-12 05:08:24.91 spid9s      Clearing tempdb database.

    2014-08-12 05:08:28.20 spid9s      Error: 5171, Severity: 16, State: 1.

    2014-08-12 05:08:28.20 spid9s      F:\TEMPDB\tempdb.mdf is not a primary database file.

    2014-08-12 05:08:28.26 spid9s      Error: 1802, Severity: 16, State: 4.

    2014-08-12 05:08:28.26 spid9s      CREATE DATABASE failed. Some file names listed could not be created. Check related errors.

    2014-08-12 05:08:28.26 spid9s      Could not create tempdb. You may not have enough disk space available. Free additional disk space by deleting other files on the tempdb drive and then restart SQL Server. Check for additional errors in the event log that may indicate why the tempdb files could not be initialized.

    2014-08-12 05:08:28.29 spid9s      SQL Server shutdown has been initiated

     

    This started happening after I moved tempdb to a new location by following my own earlier blog. Here is the command which I ran:

    USE master; 
    GO 
    ALTER DATABASE tempdb 
    MODIFY FILE (NAME = tempdev, FILENAME = 'F:\TEMPDB\tempdb.mdf'); 
    GO 
    ALTER DATABASE tempdb 
    MODIFY FILE (NAME = templog, FILENAME = 'F:\TEMPDB\tempdb.mdf'); 
    GO 
    
    

    If you look closely, I made a mistake in the extension of the log file, due to which both logical files point to the same physical file. This can easily be corrected by starting SQL Server in minimal configuration using the f startup parameter and correcting the path.

    When I tried the same in SQL Server 2014, I got the error message below, which is amazing.
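This kind of mistake is easy to catch before running the script. Here is a rough Python sketch that scans a T-SQL script for MODIFY FILE clauses and reports any physical path used by more than one logical file. The function name is hypothetical and the regular expression only understands the simple NAME/FILENAME form used in this post.

```python
import re
from collections import defaultdict

# Matches the simple form: MODIFY FILE (NAME = logical, FILENAME = 'path')
_MODIFY_FILE = re.compile(
    r"MODIFY\s+FILE\s*\(\s*NAME\s*=\s*(?P<name>\w+)\s*,\s*FILENAME\s*=\s*N?'(?P<path>[^']+)'\s*\)",
    re.IGNORECASE,
)

def duplicate_file_paths(tsql):
    """Map each physical path used by more than one logical file to those logical names."""
    by_path = defaultdict(list)
    for match in _MODIFY_FILE.finditer(tsql):
        # Windows paths are case-insensitive, so group them in lower case.
        by_path[match.group("path").lower()].append(match.group("name"))
    return {path: names for path, names in by_path.items() if len(names) > 1}
```

For the script above it reports that tempdev and templog both point to f:\tempdb\tempdb.mdf, and it returns an empty result once the log file is given its own path.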

    Msg 12106, Level 16, State 1, Line 6

    The path name ‘F:\TEMPDB\tempdb.mdf’ is already used by another database file. Change to another valid and UNUSED name.

    If this is happening for a database other than tempdb after moving its files, then you may want to check whether the move command was correct: verify both the logical name and the physical file path. If it happens after a crash, then you may need to restore from the last known good backup. If you don’t have a backup then … you need to find a new assignment! Take this as a lesson and move on. There are data recovery tools available, but I have not worked with them and can’t recommend any.

    Hope this helps.

  • Troubleshooting : %1 is not a valid win32 application

    As part of my job, a significant part of my working hours is spent troubleshooting SQL related issues. They range across SQL installation, performance, high availability, T-SQL queries and pretty much any other area related to SQL Server.

    A few days back, one of my friends pinged me and told me that he was not able to start the “Reporting Services” service. I asked whether he was seeing any error message anywhere, such as in the event logs. He shared the one below with me.

    image

    Here is the text of the message.

    Service cannot be started. System.Exception: Default appdomain failed to initialize.

       at Microsoft.ReportingServices.Library.ServiceAppDomainController.Start()

       at Microsoft.ReportingServices.Library.ReportService.OnStart(String[] args)

       at System.ServiceProcess.ServiceBase.ServiceQueuedMainCallback(Object state)

    That message did not tell us what was wrong. I researched further and found that, similar to the ERRORLOG in SQL Server, there are logs for Reporting Services as well. Here is the more detailed message in the Reporting Services log.

    image

    configmanager!DefaultDomain!e10!08/16/2014-01:52:29:: e ERROR: Error loading configuration file: %1 is not a valid Win32 application

    library!DefaultDomain!e10!08/16/2014-01:52:29:: e ERROR: Throwing Microsoft.ReportingServices.Diagnostics.Utilities.ServerConfigurationErrorException: , Microsoft.ReportingServices.Diagnostics.Utilities.ServerConfigurationErrorException: The report server has encountered a configuration error.  ---> System.ComponentModel.Win32Exception: %1 is not a valid Win32 application

       at Microsoft.ReportingServices.Diagnostics.SafeLibraryHandle.LoadLibrary(String libName)

       at Microsoft.ReportingServices.Diagnostics.SqlInstallation.GetSkuFromSqlBoot(String instanceId, Int32& daysLeft)

       at Microsoft.ReportingServices.Diagnostics.Sku.<>c__DisplayClass6.<GetSkuFromSqlBoot>b__5()

       at Microsoft.ReportingServices.Diagnostics.RevertImpersonationContext.<>c__DisplayClass1.<Run>b__0(Object state)

       at System.Security.SecurityContext.Run(SecurityContext securityContext, ContextCallback callback, Object state)

       at Microsoft.ReportingServices.Diagnostics.RevertImpersonationContext.Run(ContextBody callback)

       at Microsoft.ReportingServices.Diagnostics.Sku.GetSkuFromSqlBoot(String instanceId)

       at Microsoft.ReportingServices.Diagnostics.Sku.GetInstalledSku(String instanceId)

       at Microsoft.ReportingServices.Diagnostics.RSConfiguration.AdjustProperties(ConfigurationPropertyBag properties)

       at Microsoft.ReportingServices.Diagnostics.RSConfiguration.Validate(ConfigurationPropertyBag properties)

       at Microsoft.ReportingServices.Diagnostics.RSConfigurationFileManager.LoadDocument()

       at Microsoft.ReportingServices.Diagnostics.RSConfigurationFileManager.LoadConfiguration()

       --- End of inner exception stack trace ---;

    appdomainmanager!DefaultDomain!e10!08/16/2014-01:52:29:: e ERROR: Appdomain:1 DefaultDomain failed to initialize. Error: Microsoft.ReportingServices.Diagnostics.Utilities.ServerConfigurationErrorException: The report server has encountered a configuration error.  ---> System.ComponentModel.Win32Exception: %1 is not a valid Win32 application.

    appdomainmanager!DefaultDomain!352c!08/16/2014-01:52:29:: e ERROR: Windows service failed to start. Exception: System.Exception: Default appdomain failed to initialize.

       at Microsoft.ReportingServices.Library.ServiceAppDomainController.Start()

    If you are a developer, you would know what a call stack is. It reads from bottom to top, and the topmost frame is where the error was raised. From the highlighted pieces it’s easy to make sense of it. Reporting Services is trying to get the SKU (edition) installed on this machine using the function GetInstalledSku. After that we see the function GetSkuFromSqlBoot, which indicates where that information comes from. Finally we see LoadLibrary, and that is the function raising the error. Now the question is why! A search on the internet (Bing/Google) suggests that, in general, the most likely cause of this error is corruption of the files being loaded. If we capture ProcMon while starting the SSRS service, it is easy to find the last loaded DLL, and then we can check whether it is the correct DLL by comparing it with another machine where things are working fine.

    Interestingly, here is what I saw under “C:\Program Files\Microsoft SQL Server\110\Shared”.

    image

    As we can see, someone has renamed the files, and the original file is now named sqlboot.dll.x64.

    Here is what I saw when I captured ProcMon:
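One way to check whether a DLL is a valid image of the expected bitness is to read its PE header. Error 193 (“%1 is not a valid Win32 application”) is what Windows returns when a file is not a valid image or targets the wrong architecture for the process. The Python sketch below reads the machine field from the header bytes; the function name is made up, and only the two common machine values are mapped.

```python
import struct

# Machine values from the PE file header; only the two common ones are mapped here.
MACHINE_NAMES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}

def pe_machine(data):
    """Return the target architecture of a PE image given its leading bytes.

    Raises ValueError if the bytes do not look like a PE image at all.
    """
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ header")
    # The offset of the PE signature is stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0" or len(data) < pe_offset + 6:
        raise ValueError("missing PE signature")
    # The machine field is the first 2 bytes after the 4-byte signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, f"unknown machine 0x{machine:04X}")
```

To use it, read the first few kilobytes of the file in binary mode, for example open(path, "rb").read(4096), and pass them in. Reading 4096 bytes is an assumption that holds for typical DLLs. Comparing the result for sqlboot.dll with a working machine shows quickly whether the wrong file is in place.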

    image

    After this I saw the “exit” of the threads and the process.

    Another symptom of the same problem was that when he ran SQL Setup to add a component, he got the error below.

    There was a failure to calculate the default value of setting DIGITALPRODUCTID.

    and this is what we see in setup logs.

    (01) 2014-08-15 10:46:44 Slp: The following is an exception stack listing the exceptions in outermost to innermost order

    (01) 2014-08-15 10:46:44 Slp: Inner exceptions are being indented

    (01) 2014-08-15 10:46:44 Slp:

    (01) 2014-08-15 10:46:44 Slp: Exception type: Microsoft.SqlServer.Chainer.Infrastructure.CalculateSettingValueException

    (01) 2014-08-15 10:46:44 Slp:     Message:

    (01) 2014-08-15 10:46:44 Slp:         There was a failure to calculate the default value of setting DIGITALPRODUCTID.

    (01) 2014-08-15 10:46:44 Slp:     HResult : 0x85640001

    (01) 2014-08-15 10:46:44 Slp:         FacilityCode : 1380 (564)

    (01) 2014-08-15 10:46:44 Slp:         ErrorCode : 1 (0001)

    (01) 2014-08-15 10:46:44 Slp:     Data:

    (01) 2014-08-15 10:46:44 Slp:       SettingId = DIGITALPRODUCTID

    (01) 2014-08-15 10:46:44 Slp:       WatsonData = Microsoft.SqlServer.Chainer.Infrastructure.CalculateSettingValueException@1

    (01) 2014-08-15 10:46:44 Slp:     Stack:

    (01) 2014-08-15 10:46:44 Slp:         at Microsoft.SqlServer.Chainer.Infrastructure.Setting`1.CalculateValue()

    (01) 2014-08-15 10:46:44 Slp:         at Microsoft.SqlServer.Deployment.PrioritizedPublishing.PublishingQueue.CallQueuedSubscriberDelegates()

    (01) 2014-08-15 10:46:44 Slp:         at Microsoft.SqlServer.Deployment.PrioritizedPublishing.PublishingQueue.Publish(Publisher publisher)

    (01) 2014-08-15 10:46:44 Slp:         at Microsoft.SqlServer.Chainer.Infrastructure.Setting`1.set_Value(T value)

    (01) 2014-08-15 10:46:44 Slp:         at Microsoft.SqlServer.Chainer.Infrastructure.Setting`1.SetValue(Object newValue, InputSettingSource source)

    (01) 2014-08-15 10:46:44 Slp:         at Microsoft.SqlServer.Chainer.Infrastructure.InputSettingService.SetSettingValue[T](String settingName, T value, InputSettingSource source)

    (01) 2014-08-15 10:46:44 Slp:         at Microsoft.SqlServer.Configuration.Property`1.SetValueAndSource(Object value, InputSettingSource source)

    (01) 2014-08-15 10:46:44 Slp:         at Microsoft.SqlServer.Configuration.InstallWizard.InstallTypeController.SaveData()

    (01) 2014-08-15 10:46:44 Slp:         at Microsoft.SqlServer.Configuration.InstallWizardFramework.InstallWizardPageHost.PageLeaving(PageChangeReason reason)

    (01) 2014-08-15 10:46:44 Slp:         at Microsoft.SqlServer.Configuration.WizardFramework.UIHost.set_SelectedPageIndex(Int32 value)

    (01) 2014-08-15 10:46:44 Slp:         at Microsoft.SqlServer.Configuration.WizardFramework.NavigationButtons.nextButton_Click(Object sender, EventArgs e)

    (01) 2014-08-15 10:46:44 Slp:         at System.Windows.Forms.Control.OnClick(EventArgs e)

    (01) 2014-08-15 10:46:44 Slp:         at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)

    (01) 2014-08-15 10:46:44 Slp:         at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)

    (01) 2014-08-15 10:46:44 Slp:         at System.Windows.Forms.Control.WndProc(Message& m)

    (01) 2014-08-15 10:46:44 Slp:         at System.Windows.Forms.ButtonBase.WndProc(Message& m)

    (01) 2014-08-15 10:46:44 Slp:         at System.Windows.Forms.Button.WndProc(Message& m)

    (01) 2014-08-15 10:46:44 Slp:         at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)

    (01) 2014-08-15 10:46:44 Slp:         at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)

    (01) 2014-08-15 10:46:44 Slp:     Inner exception type: System.ComponentModel.Win32Exception

    (01) 2014-08-15 10:46:44 Slp:         Message:

    (01) 2014-08-15 10:46:44 Slp:                %1 is not a valid Win32 application.

    (01) 2014-08-15 10:46:44 Slp:                

    (01) 2014-08-15 10:46:44 Slp:         HResult : 0x80004005

    (01) 2014-08-15 10:46:44 Slp:         Error : 193

    (01) 2014-08-15 10:46:44 Slp:         Stack:

    (01) 2014-08-15 10:46:44 Slp:                 at Microsoft.SqlServer.Configuration.Sco.SqlbootModule.get_Handle()

    (01) 2014-08-15 10:46:44 Slp:                 at Microsoft.SqlServer.Configuration.Sco.EditionInfo.GetEditionInfo(String RegistryPath, RegistryView view, UInt32& daysLeft)

    (01) 2014-08-15 10:46:44 Slp:                at Microsoft.SqlServer.Configuration.Sco.EditionInfo.GetEditionInfo(String RegistryPath, RegistryView view)

    (01) 2014-08-15 10:46:44 Slp:                 at Microsoft.SqlServer.Configuration.SetupExtension.SqlEditionSetting`1.GetDefaultSqlEditionInfoValue()

    (01) 2014-08-15 10:46:44 Slp:                 at Microsoft.SqlServer.Configuration.SetupExtension.DigitalProductIdSetting.DefaultValue()

    (01) 2014-08-15 10:46:44 Slp:                 at Microsoft.SqlServer.Deployment.PrioritizedPublishing.PublishingQueue.CallFunctionWhileAutosubscribing[T](SubscriberDelegate subscriberDelegate, Int32 priority, AutosubscribingFunctionDelegate`1 function)

    (01) 2014-08-15 10:46:44 Slp:                 at Microsoft.SqlServer.Chainer.Infrastructure.Setting`1.CalculateValue()

    So we can clearly see that someone has messed around with the files related to SQL. Here also we are seeing SqlbootModule.

    RESOLUTION

    In this situation, I renamed the files back to their correct names and the problem was fixed. BUT if you get the error “%1 is not a valid Win32 application”, you might need to uninstall and reinstall SQL Server. It is not always possible to find which file is corrupted, and reinstallation would be a faster approach.

    Hope this helps!

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Author: SQL Server 2012 AlwaysOnPaperback, Kindle

  • Help : Getting error – Cannot resolve the collation conflict

    Recently someone posted on the MSDN forum about a collation-related error that I have seen many times, so today I am taking the time to write some notes about it. Here is the famous collation conflict error which every DBA encounters at least once in their career.

    Msg 468, Level 16, State 9, Line 29

    Cannot resolve the collation conflict between "SQL_Latin1_General_CP1_CS_AS" and "SQL_Latin1_General_CP1_CI_AS" in the equal to operation.

    Let’s understand the error first. There are two collations in the error message. At a quick glance they look similar, but they are not: the first one has CS (case sensitive) and the second one has CI (case insensitive). From the message it’s clear that SQL Server can’t implicitly convert values between these two collations. The next step would be to capture a Profiler trace, or use some other troubleshooting technique, to identify the offending query. In other words, we get this error when a query joins or compares two or more columns that have different collations. There are two ways to end up with this error message:

    1. The columns are in two different databases that have different default collations.
    2. The columns are in the same database but were created with explicitly different collations.
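
    Since the two names in the error differ in a single token, a token-by-token comparison makes the difference obvious. Below is a minimal Python sketch of that idea; the helper name collation_diff is an assumption made for illustration and is not part of SQL Server or any library.

```python
def collation_diff(left: str, right: str) -> list[tuple[str, str]]:
    """Return the (left, right) token pairs that differ between two collation names."""
    left_tokens = left.split("_")
    right_tokens = right.split("_")
    # Pad the shorter list so that a missing token is reported as well.
    width = max(len(left_tokens), len(right_tokens))
    left_tokens += [""] * (width - len(left_tokens))
    right_tokens += [""] * (width - len(right_tokens))
    return [(l, r) for l, r in zip(left_tokens, right_tokens) if l != r]


if __name__ == "__main__":
    # The two collations from Msg 468 above.
    print(collation_diff("SQL_Latin1_General_CP1_CS_AS",
                         "SQL_Latin1_General_CP1_CI_AS"))
```

    For the two collations in the message, the only differing pair is CS versus CI.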

    In most cases this error appears after a database has been moved from one server to another, which falls under the first category. Here is a quick script to reproduce the error.

     

    CREATE DATABASE [CaseSensitiveCollation]
     COLLATE SQL_Latin1_General_CP1_CS_AS
    GO
    USE [CaseSensitiveCollation]
    GO
    CREATE TABLE MyTableInCaseSensitiveDatabase (vc varchar(100))
    GO
    INSERT INTO MyTableInCaseSensitiveDatabase
    VALUES ('sysobjects')
    GO
    -- Joining a CI_AS column (master) to a CS_AS column raises Msg 468
    SELECT *
    FROM   master.sys.objects a
           INNER JOIN CaseSensitiveCollation.dbo.MyTableInCaseSensitiveDatabase b
                   ON a.name = b.vc

    Here is the screenshot.

     

    image

     

    The collation of the master database is SQL_Latin1_General_CP1_CI_AS while column vc in the table uses a different one, hence the error. Now, how do we resolve this error? We have multiple options.

     

    • Tell SQL Server which side of the comparison to convert, and to which collation, by adding a COLLATE clause.
    SELECT * 
    FROM   master.sys.objects a 
           INNER JOIN CaseSensitiveCollation.dbo.MyTableInCaseSensitiveDatabase b 
                   ON a.name = b.vc   collate SQL_Latin1_General_CP1_CI_AS
    • If the error appeared after restoring a database from a different server, check the server collation of the source server. In this situation we might have to rebuild the target server so that its collation matches the source server. This is as good as reinstalling SQL Server.
    • If the error comes from a database that was created as part of a product installation, review the product documentation for details on supported collations.

    I have seen a few DBAs suggest changing the database collation. It is important to understand that a column's collation is set when the table is created. We can check it as shown below.

    image

    Even if we alter the database collation, the collation of columns in already created tables would NOT change. One way to change the collation of existing tables is:

    • Move the data to a new table with new collation.
    • Get the script of the table and create same index, stats etc. on new tables.
    • Drop the old table
    • Rename new table as old table. 

    If the error is caused by temporary tables created in the tempdb database, then the contained database feature is worth considering. Another option would be to specify the column-level collation while creating the table, as below.

    CREATE TABLE #SQLServerHelp
       (iPK int PRIMARY KEY,
        nCol nchar COLLATE SQL_Latin1_General_CP1_CS_AS
       );
    

    Here are other related errors which you might get.

    • Cannot resolve collation conflict between "%ls" and "%ls" in %ls operator for %ls operation.
    • Collation conflict caused by collate clauses with different collation ‘%.*ls’ and ‘%.*ls’.
    • Cannot resolve collation conflict between "%ls" and "%ls" in %ls operator occurring in %ls statement column %d.
    • Implicit conversion of %ls value to %ls cannot be performed because the resulting collation is unresolved due to collation conflict between "%ls" and "%ls" in %ls operator.
    • Implicit conversion of %ls value to %ls cannot be performed because the collation of the value is unresolved due to a collation conflict between "%ls" and "%ls" in %ls operator.
    • Cannot resolve the collation conflict between "%.*ls" and "%.*ls" in the %ls operation.

    Hope this would help you in troubleshooting and fixing collation errors.

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Author: SQL Server 2012 AlwaysOnPaperback, Kindle

  • Troubleshooting : Error – Incorrect syntax near ‘GO’

    Sometimes we are so comfortable with, and so used to, certain things that if they change we become nervous and uneasy. I have a lovely daughter and she always greets me when I come back home from the office. That one “hello” takes away all my worries and makes me feel alive. Yesterday she didn’t do that and I was worried. I checked with my wife, who told me that she had a mild fever and her mood was a little different that day. That made me a little nervous.

    I had the same feeling when I saw the error below in my colleague’s Management Studio:

    Msg 102, Level 15, State 1, Line 2

    Incorrect syntax near ‘GO’.

    image

     

    Management Studio’s IntelliSense was also complaining about the syntax: “Incorrect syntax near ‘End Of File’. Expecting ‘=’.” As per the documentation, “GO is not a Transact-SQL statement; it is a command recognized by the sqlcmd and osql utilities and SQL Server Management Studio Code editor”.

    I went to Tools > Options in Management Studio and found that the batch separator had been customized. The setting is under SSMS > Tools > Options > Query Execution > SQL Server > General > Batch Separator.

    The separator had been set to “come”, which means the script should work if I run it with “come” instead of GO, and it did, as expected.

    image

    Perfect! The mystery is solved. This made me wonder whether we can customize SQLCMD as well. If we look at the SQLCMD help, we can see the parameter -c cmdend.

    image

    Here is the usage of the -c parameter. I have used hello as the command terminator, and hello works exactly the same as GO.

    image

    I hope this clears some confusion about batch separator.

    If you are getting this error while running a script through ExecuteNonQuery in a .NET program, refer to http://blogs.msdn.com/b/onoj/archive/2008/02/26/incorrect-syntax-near-go-sqlcommand-executenonquery.aspx which has a workaround.
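
    Because GO is handled by the client tool and never reaches SQL Server, a program that sends a script through an API has to split the script into batches itself and send them one by one. The following is a minimal Python sketch of that splitting step, not the workaround from the linked post. The function name split_batches is hypothetical, and the sketch assumes the separator sits alone on its own line (it ignores repeat counts such as GO 5 and separators inside comments or strings).

```python
import re


def split_batches(script: str, separator: str = "GO") -> list[str]:
    """Split a T-SQL script into batches on lines that contain only the separator."""
    # The separator must stand alone on its line; matching is case-insensitive.
    pattern = re.compile(
        r"^[ \t]*" + re.escape(separator) + r"[ \t]*\r?$",
        re.IGNORECASE | re.MULTILINE,
    )
    batches = [part.strip() for part in pattern.split(script)]
    # Drop empty batches, for example the one after a trailing separator.
    return [batch for batch in batches if batch]


if __name__ == "__main__":
    script = "SELECT 1\nGO\nSELECT 2\ngo\n"
    for batch in split_batches(script):
        print(repr(batch))
```

    Passing separator="come" mimics the customized batch separator from this post; each returned batch would then be sent to the server as a separate command.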

    Hope this helps!

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Author: SQL Server 2012 AlwaysOnPaperback, Kindle


  • Solution: Unable to launch SQL Server Configuration Manager – Invalid class [0x80041010]

    While launching SQL Server Configuration Manager on one of my machines, I got the error below.

    image

    Here is the text of the error message:

    —————————
    SQL Server Configuration Manager
    —————————
    Cannot connect to WMI provider. You do not have permission or the server is unreachable. Note that you can only manage SQL Server 2005 and later servers with SQL Server Configuration Manager.
    Invalid class [0x80041010]
    —————————
    OK  
    —————————

    There might be various reasons for this error. In this case the actual problem is “Invalid class”, shown on the last line of the message. I looked further and found that the command below was the solution for me. The same solution also works for the Invalid namespace [0x8004100e] error.

    image

    C:\WINDOWS\system32>mofcomp "C:\Program Files (x86)\Microsoft SQL Server\120\Shared\sqlmgmproviderxpsp2up.mof"
    Microsoft (R) MOF Compiler Version 6.3.9600.16384
    Copyright (c) Microsoft Corp. 1997-2006. All rights reserved.
    Parsing MOF file: C:\Program Files (x86)\Microsoft SQL Server\120\Shared\sqlmgmproviderxpsp2up.mof
    MOF file has been successfully parsed
    Storing data in the repository…
    Done!

    MofComp is a command-line utility that compiles MOF (Managed Object Format) files and stores the data in the WMI repository. The MOF compiler is available in the %Windir%\System32\wbem directory. So if you get “‘mofcomp’ is not recognized as an internal or external command”, try changing the current directory in the command prompt to %Windir%\System32\wbem.

    Also note that the mof file on my machine is under the 120 folder. Depending on the version of SQL Server installed, you may have it in a different folder:

    SQL Server version              Folder
    Microsoft SQL Server 2014       120
    Microsoft SQL Server 2012       110
    Microsoft SQL Server 2008 R2    100
    Microsoft SQL Server 2008       100
    Microsoft SQL Server 2005       90
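
    The version-to-folder mapping above can be turned into a small helper that builds the mofcomp command line for a given version. This is a sketch under assumptions: the function names are made up for illustration, the Program Files location is the one shown in the command above, and the mof file name is assumed to be the same for every version.

```python
import ntpath

# Folder numbers as listed in this post.
VERSION_FOLDERS = {
    "2014": "120",
    "2012": "110",
    "2008 R2": "100",
    "2008": "100",
    "2005": "90",
}


def mof_path(version: str, program_files: str = r"C:\Program Files (x86)") -> str:
    """Build the path of the provider mof file (file name assumed identical across versions)."""
    folder = VERSION_FOLDERS[version]
    # ntpath always joins with backslashes, so this also works off Windows.
    return ntpath.join(program_files, "Microsoft SQL Server", folder,
                       "Shared", "sqlmgmproviderxpsp2up.mof")


def mofcomp_command(version: str) -> str:
    """Return the command line to run from an elevated command prompt."""
    return f'mofcomp "{mof_path(version)}"'


if __name__ == "__main__":
    print(mofcomp_command("2012"))
```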

     

    Hope this helps.

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Author: SQL Server 2012 AlwaysOnPaperback, Kindle

  • Solution : Cannot add new node – Rule "SQL Server Database Services feature state" failed.

    While deploying a SQL Server 2014 cluster in my lab, I ran into this problem when I tried to add the second node (Node 2) to the SQL cluster.

    —————————
    Rule Check Result
    —————————
    Rule "SQL Server Database Services feature state" failed. The SQL Server Database Services feature failed when it was initially installed. The feature must be removed before the current scenario can proceed.
    —————————
    OK
    —————————

    The above error means that when I installed SQL Server on the first cluster node, it didn’t install properly. There were some errors while creating the cluster, and because of them AddNode is blocked. That was true in my case: I had an issue with the SQL Server Network Name resource, which didn’t come online earlier. Below was the original error on Node 1.

    Cluster network name resource ‘SQL Network Name (Balmukund)’ cannot be brought online. The computer object associated with the resource could not be updated in domain ‘MyDomain.com’ for the following reason:
    Unable to update password for computer account.

    The text for the associated error code is: Access is denied.

    The cluster identity ‘WinCluster$’ may lack permissions required to update the object. Please work with your domain administrator to ensure that the cluster identity can update computer objects in the domain.

    I fixed the above error by giving the proper permissions to the cluster computer object on Balmukund (which was the network name of SQL). After fixing that, I created the SQL Server and SQL Agent resources manually. Sometimes it may also happen that SQL Agent didn’t come online on Node 1 due to some failure and it was fixed after setup was completed.

    So, if we see the above rule failure, we should go to the first node and check the setup log files to see whether the installation succeeded or failed. If the first node installation had some failures which were fixed later, then we should do a repair of the installation. The Repair option can be found on the "Maintenance" page in SQL Server Installation Center (setup). This action will clear the failure state and allow the node to be added.

    If you are too lazy, like me, to do a repair, you can also check the registry key
    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL12.SQL2014\ConfigurationState

    If you see a value of 2 for any component, then there were some failures for that component. Since in most cases the issue would be with SQL Server or SQL Agent, we might see 2 for MPT_AGENT_CORE_CNI, SQL_Engine_Core_Inst, SQL_FullText_Adv, SQL_Replication_Core_Inst.

    Note: we MUST fix the error on the first node before taking the registry shortcut. Once the error is fixed, we can set the value of those components to 1 (which means success).

    The MSSQL12.SQL2014 part of the key might vary based on SQL version and instance name. The first piece is MSSQL10 for SQL 2008, MSSQL10_50 for SQL 2008 R2, MSSQL11 for SQL 2012 and MSSQL12 for SQL 2014. The second piece is the instance name (for me the instance name is SQL2014).
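
    Putting the two pieces together, the key path can be built from the version and the instance name. The sketch below is illustrative only: the function names are hypothetical, it assumes the instance ID format described above, and it works on values that have already been read from the registry rather than reading them itself.

```python
# Version prefixes as listed in this post.
VERSION_PREFIX = {
    "2008": "MSSQL10",
    "2008 R2": "MSSQL10_50",
    "2012": "MSSQL11",
    "2014": "MSSQL12",
}


def configuration_state_key(version: str, instance_name: str) -> str:
    """Build the ConfigurationState key path for a version and instance name."""
    instance_id = f"{VERSION_PREFIX[version]}.{instance_name}"
    return ("HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\"
            f"{instance_id}\\ConfigurationState")


def failed_features(state: dict[str, int]) -> list[str]:
    """Return the feature names whose value is 2 (failure); 1 means success."""
    return sorted(name for name, value in state.items() if value == 2)


if __name__ == "__main__":
    print(configuration_state_key("2014", "SQL2014"))
    # Example values, typed in by hand for illustration.
    print(failed_features({"SQL_Engine_Core_Inst": 2, "SQL_FullText_Adv": 1}))
```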

    Hope this helps!

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Author: SQL Server 2012 AlwaysOnPaperback, Kindle

  • SQL SERVER – SSMS Database Expand Hang – High waits on PREEMPTIVE_OS_LOOKUPACCOUNTSID

    Recently a friend reported the issues below.

    1. When I expand a database, it takes a lot of time.
    2. When I expand Jobs under the SQL Server Agent node in SQL Server Management Studio, it freezes and finally fails with the error
      “An exception occurred while executing a Transact-SQL statement or batch. (Microsoft.SqlServer.ConnectionInfo)
      Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding”

    The symptom on the server we were working on was that Management Studio would hang when browsing SQL Agent jobs. I asked him to capture a Profiler trace to find out which query was taking time and what the waits were for the stuck queries.

    From the Profiler trace and the query to capture currently running queries (refer this blog), I found that it was running the sp_help_job stored procedure from the MSDB database. This procedure returns high-level details about all jobs in the MSDB database using msdb.dbo.sysjobs_view. When we looked further, we found that SQL Server was running the function dbo.SQLAGENT_SUSER_SNAME and was stuck at the SELECT @ret = SUSER_SNAME(@user_sid) statement. The wait for the session was PREEMPTIVE_OS_LOOKUPACCOUNTSID and the wait time was increasing. This wait is related to communication with, and validation by, Active Directory.

    When we debugged further, this was the call chain leading to the function.

    sp_help_job  
                 >> sp_get_composite_job_info 
                         >> Query having – owner = dbo.SQLAGENT_SUSER_SNAME(sjv.owner_sid) 
                                    >>  SELECT @ret = SUSER_SNAME(@user_sid)

    This was getting stuck at PREEMPTIVE_OS_LOOKUPACCOUNTSID. We identified that the function converts SIDs stored in a SQL Server table to names by making a call to Active Directory. Now the challenge was to find out why, and also to identify whether it was happening with particular logins or with all logins. The complication here was that SQL Server stores the SID in varbinary format, not in the format which the OS understands.

    Luckily, I had a blog post with a script to convert the varbinary to the well-known format, so I used that to convert the SIDs obtained from the query below.

    select    owner_sid 
    from    msdb.dbo.sysjobs_view
    where    owner_sid <> 0x01
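
    For reference, the binary layout of a Windows SID is: one byte for the revision, one byte for the number of sub-authorities, a 6-byte big-endian identifier authority, and then one 4-byte little-endian value per sub-authority. The following Python sketch converts such a value to the S-1-5-... form. It is not the script from the blog post mentioned above, the function names are hypothetical, and it only makes sense for Windows SIDs (SIDs of SQL logins do not follow this layout).

```python
import struct


def sid_from_hex(text: str) -> bytes:
    """Turn a varbinary literal such as 0x0101... into raw bytes."""
    text = text.strip()
    if text.lower().startswith("0x"):
        text = text[2:]
    return bytes.fromhex(text)


def sid_to_string(sid: bytes) -> str:
    """Convert a binary Windows SID into its S-<revision>-<authority>-<sub...> form."""
    if len(sid) < 8:
        raise ValueError("a Windows SID has at least 8 bytes")
    revision = sid[0]
    count = sid[1]
    if len(sid) != 8 + 4 * count:
        raise ValueError("length does not match the sub-authority count")
    # Identifier authority: 6 bytes, big-endian.
    authority = int.from_bytes(sid[2:8], "big")
    # Sub-authorities: unsigned 32-bit integers, little-endian.
    sub_authorities = struct.unpack(f"<{count}I", sid[8:])
    parts = ["S", str(revision), str(authority)]
    parts.extend(str(value) for value in sub_authorities)
    return "-".join(parts)


if __name__ == "__main__":
    # Well-known SID of the Local System account.
    print(sid_to_string(sid_from_hex("0x010100000000000512000000")))
```

    The resulting string is the form that tools such as PsGetSid accept.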

     

    Once we had the SID value in the OS-understandable format, I used the PsGetSid tool from Sysinternals to get the Windows account name. It took a long time to run and finally failed with the error below.

    Error querying SID:

    The trust relationship between the primary domain and the trusted domain failed

    So, it had something to do with the trust between two domains, which seemed to be broken. I asked him to work with his Windows Domain Admin team and networking team to get the issue resolved.

    The same issue might happen while expanding databases as well, because the database owner is a SID stored in sys.databases which has to be converted to a name.

    Hope this helps.

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani

  • Tips and Tricks: Useful Parameters of Get-ClusterLog

    Root Cause Analysis (RCA) is also part of my work at Microsoft. For a cluster failover RCA, it is very important to get the cluster log. Sometimes there are situations where we want to generate the cluster log for only the last few minutes, for quicker analysis of live issues. This blog explains some common parameters which I use in my day-to-day troubleshooting.

    I have a 4-node cluster in my lab with nodes named SRV1, SRV2, SRV3, SRV4.

    • Default command: generates the log on ALL nodes in the C:\Windows\Cluster\Reports folder. The file name would be Cluster.log.

    Get-ClusterLog 

    • If we want the cluster log to be generated for specific node(s), we can use the -Node parameter with comma-separated node names, as shown below.

    Get-ClusterLog -Node SRV1, SRV3

    • You might know that the time shown in the cluster log is UTC by default. Sometimes it’s difficult to translate UTC time to local time, especially for time zones which have daylight saving. Luckily, the cluster log can be generated in local time using the UseLocalTime parameter. Here is the sample code.

    Get-ClusterLog -UseLocalTime

    • Another useful parameter copies the files to a specific location. The command generates the logs and also copies them to the specified location. In the example below, I am copying the logs from all nodes to the C:\Temp folder.

    Get-ClusterLog -Destination "C:\Temp"

    • TimeSpan is another parameter, which generates the cluster log for the last specified number of minutes. By default the command generates Cluster.log for the complete time. I find this useful when I have reproduced a problem and want to look at the cluster log for the last 2 to 3 minutes. Here is the command to generate the log for the last 3 minutes.

    Get-ClusterLog -TimeSpan 3

    So, this is my favorite command after reproducing a cluster issue on the local node.

    Get-ClusterLog -Node SRV1 -TimeSpan 2 -UseLocalTime -Destination C:\

    Hopefully you find it useful.

    Cheers,
    Balmukund


    How to fix SQL Patch Error: No valid sequence could be found for the set of updates.

    DISCLAIMER : THIS CODE AND INFORMATION IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

    Note: It is recommended to take backup of all the files before making any changes.

    Recently we worked on a support case where we were having trouble installing a SQL Server patch. The instance was SQL Server 2016 SP2 and we were trying to install KB4524334, CU10 for SQL Server 2016 SP2. There was a failure while upgrading the PolyBase feature. (This can happen with other features too, and the same steps can be followed to fix them.) The screenshot was something similar.

    Let me start by explaining how we troubleshoot installation/patch failures. Whenever we troubleshoot a SQL Server setup or patch issue, we start by looking at the log files to find the exact error message which caused the failure. In the case of patching, we first look at the Summary.txt file, which gives the overall picture for all instances that setup attempted to patch. Based on which instance had the failure, we then look at the summary file located under the folder named after that instance and dig further. Here is the article from the online documentation:

    View and Read SQL Server Setup Log Files

    Let’s go step by step in our scenario.

    1. We looked at Summary_<MachineName>_<DateTimeStamp>.txt which is located under the folder having date-time stamp. I have highlighted important information which you need to review.

    Overall summary:
      Final result:                  The patch installer has failed to update the following instance: MSSQLSERVER. To determine the reason for failure, review the log files.
      Exit code (Decimal):           -2068052368
      Start time:                    2019-11-26 11:57:40
      End time:                      2019-11-26 12:00:01
      Requested action:              Patch

    Instance MSSQLSERVER overall summary:
      Final result:                  The patch installer has failed to update the shared features. To determine the reason for failure, review the log files.
      Exit code (Decimal):           -2068052368
      Start time:                    2019-11-26 11:58:49
      End time:                      2019-11-26 11:59:51
      Requested action:              Patch

    2. We can see “Overall summary” and “Final result”, which tell us which instance failed, if any. In this case it is “MSSQLSERVER”, which is the default instance. Below “Overall summary” we can also see the overall summary of each instance.

    3. Our failure was for instance MSSQLSERVER, hence we need to go inside the MSSQLSERVER folder and look at the Summary_<MachineName>_<DateTimeStamp>.txt file there. The information in this file tells us the exact component which failed. Here are the various sections in that file and their meaning.

    Overall summary:     
    <This section shows the action requested (like Install, Patch, Repair, EditionUpgrade etc.) and the overall outcome of the setup. It also tells whether a reboot is needed>

    Machine Properties:    
    <Various Details about the machine like name, operating system, language etc.>

    Product features discovered:   
    <This section would show details about SQL features which are already installed on this machine> 

    Package properties:    
    <This section would show information about the media which we are using to install/patch, its version, patch level etc.>

    User Input Settings:    
    <This section would show the action requested: Install, Patch, EditionUpgrade etc.>

    Detailed results:     
    <This section would show information about each feature being installed and whether it Passed or Failed. In case of failure, it would also show information about how to move forward>

     

    The other sections were not relevant to this troubleshooting, so I am ignoring them.

    Now that we know the sections, let us use the information available.

    Overall summary:
      Final result:                  The patch installer has failed to update the shared features. To determine the reason for failure, review the log files.
      Exit code (Decimal):           -2068052368
      Start time:                    2019-11-26 11:58:49
      End time:                      2019-11-26 11:59:51
      Requested action:              Patch

    Setup completed with required actions for features.
    Troubleshooting information for those features:
      Next step for Polybase:        Use the following information to resolve the error, and then try the setup process again.

    Detailed results:
      Feature:                       PolyBase Query Service for External Data
      Status:                        Failed: see logs for details
      Reason for failure:            An error occurred during the setup process of the feature.
      Next Step:                     Use the following information to resolve the error, and then try the setup process again.
      Component name:                SQL PolyBase
      Component error code:          1648
      Component log file:            C:\Program Files\Microsoft SQL Server\130\Setup Bootstrap\Log\20191126_115725\MSSQLSERVER\sql_polybase_core_inst_Cpu64_1.log
      Error description:             No valid sequence could be found for the set of updates.

      Feature:                       R Services (In-Database)
      Status:                        Passed

      Feature:                       Database Engine Services
      Status:                        Passed

      Feature:                       Integration Services
      Status:                        Passed

      Feature:                       Analysis Services
      Status:                        Passed

      Feature:                       SQL Browser
      Status:                        Passed

      Feature:                       SQL Writer
      Status:                        Passed

      Feature:                       LocalDB
      Status:                        Passed

      Feature:                       Setup Support Files
      Status:                        Passed
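
    The decimal exit code is easier to relate to the component error when viewed as an unsigned 32-bit hexadecimal value. The Python sketch below (function name hypothetical) does the conversion and extracts the low 16 bits. For the summary above, those low 16 bits equal the component error code 1648; whether that holds for every setup failure is an assumption that should be checked against the logs.

```python
def decode_exit_code(exit_code: int) -> tuple[str, int]:
    """Return the unsigned 32-bit hex form of an exit code and its low 16 bits."""
    # Masking gives the two's complement view of a negative 32-bit value.
    unsigned = exit_code & 0xFFFFFFFF
    return f"0x{unsigned:08X}", unsigned & 0xFFFF


if __name__ == "__main__":
    # Exit code (Decimal) from the summary above.
    hex_form, low_word = decode_exit_code(-2068052368)
    print(hex_form, low_word)  # prints: 0x84BC0670 1648
```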

    From the above information we can easily see that the PolyBase component failed and the other components were installed (shown as Passed). The message also points us to the log, which should have more details. I opened sql_polybase_core_inst_Cpu64_1.log; the last few lines before the failure are shown below.

    MSI (s) (BC:90) [11:59:39:817]: Opening existing patch ‘C:\Windows\Installer\39187852.msp’.
    MSI (s) (BC:90) [11:59:39:818]: Note: 1: 2205 2:  3: MsiPatchSequence
    MSI (s) (BC:90) [11:59:39:818]: Opening existing patch ‘C:\Windows\Installer\6419f1dc.msp’.
    MSI (s) (BC:90) [11:59:39:820]: Opening existing patch ‘C:\Windows\Installer\10894c.msp’.
    MSI (s) (BC:90) [11:59:39:838]: File will have security applied from OpCode.
    MSI (s) (BC:90) [11:59:42:428]: Original patch ==> G:\d43ad554dc7e18fa425c07eed0\x64\setup\sql_polybase_core_inst.msp
    MSI (s) (BC:90) [11:59:42:428]: Patch we’re running from ==> C:\Windows\Installer\3d61881.msp
    MSI (s) (BC:90) [11:59:42:429]: SOFTWARE RESTRICTION POLICY: Verifying patch –> ‘G:\d43ad554dc7e18fa425c07eed0\x64\setup\sql_polybase_core_inst.msp’ against software restriction policy
    MSI (s) (BC:90) [11:59:42:429]: SOFTWARE RESTRICTION POLICY: G:\d43ad554dc7e18fa425c07eed0\x64\setup\sql_polybase_core_inst.msp has a digital signature
    MSI (s) (BC:90) [11:59:44:784]: SOFTWARE RESTRICTION POLICY: G:\d43ad554dc7e18fa425c07eed0\x64\setup\sql_polybase_core_inst.msp is permitted to run at the ‘unrestricted’ authorization level.
    MSI (s) (BC:90) [11:59:44:784]: SequencePatches starts. Product code: {0877B0AD-CF2A-4079-BDC7-E1EE287F6B9A}, Product version: 13.0.1601.5, Upgrade code: {3577537A-A081-4B6F-8CA3-43315ED52B32}, Product language 1033
    MSI (s) (BC:90) [11:59:44:784]: Note: 1: 2205 2:  3: MsiPatchSequence
    MSI (s) (BC:90) [11:59:44:784]: Note: 1: 2203 2: RTM.1 3: -2147287038
    MSI (s) (BC:90) [11:59:44:784]: PATCH SEQUENCER ERROR: failed to open RTM.1 transform in {6A8DFAB4-23AC-4C79-B509-DF1CC39A58CD} patch! (1: 2203 2: RTM.1 3: -2147287038 )
    MSI (s) (BC:90) [11:59:44:784]: SequencePatches returns error 1648.
    MSI (s) (BC:90) [11:59:44:785]: Product: SQL Server 2016 SQL Polybase – Update ‘{F0B7DA4A-7ACB-424E-A44E-11F27090AABA}’ could not be installed. Error code 1648. Additional information is available in the log file C:\Program Files\Microsoft SQL Server\130\Setup Bootstrap\Log\20191126_115725\MSSQLSERVER\sql_polybase_core_inst_Cpu64_1.log.

    MSI (s) (BC:90) [11:59:44:786]: Windows Installer installed an update. Product Name: SQL Server 2016 SQL Polybase. Product Version: 13.0.1601.5. Product Language: 1033. Manufacturer: Microsoft Corporation. Update Name: {F0B7DA4A-7ACB-424E-A44E-11F27090AABA}. Installation success or error status: 1648.

    MSI (s) (BC:90) [11:59:44:786]: Note: 1: 1708
    MSI (s) (BC:90) [11:59:44:786]: Product: SQL Server 2016 SQL Polybase — Installation failed.

    MSI (s) (BC:90) [11:59:44:787]: Windows Installer installed the product. Product Name: SQL Server 2016 SQL Polybase. Product Version: 13.0.1601.5. Product Language: 1033. Manufacturer: Microsoft Corporation. Installation success or error status: 1648.
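
    When the log is long, the cached patch files it opens can be collected with a short script. This is a minimal Python sketch; the function name cached_patches and the regular expression are assumptions based only on the “Opening existing patch” lines shown above, so other log formats may need adjustments.

```python
import ntpath
import re

# Matches the path that follows "Opening existing patch", with or without quotes.
PATCH_LINE = re.compile(
    r"Opening existing patch\s*['‘’]?([A-Za-z]:\\[^'‘’\r\n]*?\.msp)",
    re.IGNORECASE,
)


def cached_patches(log_text: str) -> list[str]:
    """Return the distinct .msp file names opened in an MSI log, in order of appearance."""
    names: list[str] = []
    for match in PATCH_LINE.finditer(log_text):
        # ntpath handles backslash-separated paths on any platform.
        name = ntpath.basename(match.group(1))
        if name not in names:
            names.append(name)
    return names


if __name__ == "__main__":
    sample = r"MSI (s) (BC:90) [11:59:39:820]: Opening existing patch C:\Windows\Installer\10894c.msp."
    print(cached_patches(sample))
```

    The file names it returns are the ones to search for in the next step.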

    Now for the solution of the issue. If you look closely at the log file, it refers to a few msp files (39187852.msp, 6419f1dc.msp, 10894c.msp). These are cached versions of the sql_polybase_core_inst.msp files in the C:\Windows\Installer folder, one for each previous patch applied to the PolyBase feature. How would you find which MSP belongs to which patch, i.e. which service pack, which CU, which KB? Fortunately there is a detailed KB article available for this.

    How to restore the missing Windows Installer cache files and resolve problems that occur during a SQL Server update

    In the article, go with “Procedure 1.b.: Use the FindSQLInstalls.vbs script” and run the VBScript given there. Once you have run the script, open the output file and search for the msp files listed in the previous snippet. Here are the relevant sections from the VBScript output file. (Search for 39187852.msp)

    Display Name:    Service Pack 2 for SQL Polybase (64-bit) (KB4052908)
    KB Article URL: 
    http://support.microsoft.com/?kbid=4052908
    Install Date:    20190129
      Uninstallable:   1
    Patch Details:
      HKEY_CLASSES_ROOT\Installer\Patches\4BAFD8A6CA3297C45B90FDC13CA985DC
      PackageName:   sql_polybase_core_inst.msp
       Patch LastUsedSource: n;1;g:\c1d6c477712a2972547d\x64\setup\
      Installer Cache File Path:     C:\Windows\Installer\39187852.msp
        Per SOFTWARE\Microsoft\Windows\CurrentVersion\Installer\UserData\S-1-5-18\Patches\4BAFD8A6CA3297C45B90FDC13CA985DC\LocalPackage

       Package exists in the Installer cache, no actions needed.
       Package will update automatically if needed assuming that
       the LastUsedSource exists.

       Should you get errors about C:\Windows\Installer\39187852.msp or g:\c1d6c477712a2972547d\x64\setup\sql_polybase_core_inst.msp then you
       may need to manually copy missing files, if file exists replace the problem file,
       Copy and paste the following command line into an administrative command prompt.

        Copy "g:\c1d6c477712a2972547d\x64\setup\sql_polybase_core_inst.msp" C:\Windows\Installer\39187852.msp

    This clearly tells us that the MSP is from “Service Pack 2 for SQL Polybase (64-bit) (KB4052908)”. Let us have a look at the output for one more MSP. (6419f1dc.msp)

    Display Name:    Hotfix 5337 for SQL Polybase (64-bit) (KB4495256)
    KB Article URL: 
    http://support.microsoft.com/?kbid=4495256
    Install Date:    20190605
      Uninstallable:   1
    Patch Details:
      HKEY_CLASSES_ROOT\Installer\Patches\8EE0896B83D2A1246AD903269C85495A
      PackageName:   sql_polybase_core_inst.msp
       Patch LastUsedSource: n;1;G:\5645cedfa7be15cb95b0aedb1d6d05\x64\setup\
      Installer Cache File Path:     C:\Windows\Installer\6419f1dc.msp
        Per SOFTWARE\Microsoft\Windows\CurrentVersion\Installer\UserData\S-1-5-18\Patches\8EE0896B83D2A1246AD903269C85495A\LocalPackage

       Package exists in the Installer cache, no actions needed.
       Package will update automatically if needed assuming that
       the LastUsedSource exists.

       Should you get errors about C:\Windows\Installer\6419f1dc.msp or G:\5645cedfa7be15cb95b0aedb1d6d05\x64\setup\sql_polybase_core_inst.msp then you
       may need to manually copy missing files, if file exists replace the problem file,
       Copy and paste the following command line into an administrative command prompt.

        Copy "G:\5645cedfa7be15cb95b0aedb1d6d05\x64\setup\sql_polybase_core_inst.msp" C:\Windows\Installer\6419f1dc.msp

    It should be easy to see that this one is from “Hotfix 5337 for SQL Polybase (64-bit) (KB4495256)”.
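
If the output file is long, the same lookup can be scripted. Below is a minimal sketch, assuming the output keeps the layout shown in the two excerpts above, where a "Display Name:" line is followed later by an "Installer Cache File Path:" line. The function name map_cache_to_patch is my own and is not part of FindSQLInstalls.vbs.

```python
def map_cache_to_patch(report_text):
    """Map each Installer cache file path to the Display Name listed above it.

    Assumes the "Label: value" layout seen in the excerpts; other sections of
    the real output may use different labels and are simply ignored.
    """
    mapping = {}
    display_name = None
    for line in report_text.splitlines():
        # Split on the first colon only, so "C:\..." in the value is untouched.
        label, sep, value = line.strip().partition(":")
        if not sep:
            continue
        if label == "Display Name":
            display_name = value.strip()
        elif label == "Installer Cache File Path" and display_name:
            mapping[value.strip()] = display_name
    return mapping
```

Read the output file into a string, pass it to the function, and look up C:\Windows\Installer\39187852.msp (or any other file named in the failing log) in the returned dictionary.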

    Have a look at the files in the C:\Windows\Installer folder (39187852.msp, 6419f1dc.msp, 10894c.msp) and make a note of their sizes. Next, download the KB mentioned in the “Display Name”. Once it is downloaded, extract it to the path mentioned in the last line of the output, the copy command. Following the two examples above, KB4052908 needs to be extracted to g:\c1d6c477712a2972547d and KB4495256 to G:\5645cedfa7be15cb95b0aedb1d6d05. Note that this location is NOT fixed and you need to use the one from your own script output. Now compare the two files named in that same copy command. In most cases you will find that at least one of the cached files does not match its source, and that is the problem.

    To fix the issue, we need to make sure that the MSP file in C:\Windows\Installer is the same as the one on the media.
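
The comparison can also be done programmatically. The sketch below checks the size first, as described above, and then goes one step further by comparing SHA-256 hashes. The hash check is my addition, not something the KB asks for, and the function names are hypothetical.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_content(cached_path, media_path):
    """True if the cached MSP and the MSP on the media are byte-identical."""
    # A size mismatch is enough to say the files differ.
    if os.path.getsize(cached_path) != os.path.getsize(media_path):
        return False
    # Equal sizes can still hide different content, so compare hashes too.
    return sha256_of(cached_path) == sha256_of(media_path)
```

Pass the two paths from the copy command in your script output, the cached file under C:\Windows\Installer and the extracted sql_polybase_core_inst.msp. A result of False points at the file that needs replacing.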

    Here is the sql_polybase_core_inst_Cpu64_1.log file from another patch failure, this time for SQL Server 2017. This one shows only one MSP instead of the three we saw in the previous log. It all depends on which patches have been installed previously. I would like you to analyze the file below before moving on.

    MSI (s) (D4:B4) [14:43:47:265]: Opening existing patch ‘C:\windows\Installer\5640be.msp’.
    MSI (s) (D4:B4) [14:43:47:265]: Note: 1: 2205 2:  3: MsiPatchSequence
    MSI (s) (D4:B4) [14:43:47:265]: Note: 1: 1402 2: HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Policies\Explorer 3: 2
    MSI (s) (D4:B4) [14:43:47:265]: File will have security applied from OpCode.
    MSI (s) (D4:B4) [14:43:47:812]: Original patch ==> f:\05c89661f21d4215ddc41e36f60a\x64\setup\sql_polybase_core_inst.msp
    MSI (s) (D4:B4) [14:43:47:812]: Patch we’re running from ==> C:\windows\Installer\611c45.msp
    MSI (s) (D4:B4) [14:43:47:812]: SOFTWARE RESTRICTION POLICY: Verifying patch –> ‘f:\05c89661f21d4215ddc41e36f60a\x64\setup\sql_polybase_core_inst.msp’ against software restriction policy
    MSI (s) (D4:B4) [14:43:47:812]: SOFTWARE RESTRICTION POLICY: f:\05c89661f21d4215ddc41e36f60a\x64\setup\sql_polybase_core_inst.msp has a digital signature
    MSI (s) (D4:B4) [14:43:49:062]: SOFTWARE RESTRICTION POLICY: f:\05c89661f21d4215ddc41e36f60a\x64\setup\sql_polybase_core_inst.msp is permitted to run at the ‘unrestricted’ authorization level.
    MSI (s) (D4:B4) [14:43:49:062]: SequencePatches starts. Product code: {41E37C92-1A68-4909-95A1-D9D59B0C0548}, Product version: 14.0.1000.169, Upgrade code: {3CCB3D90-D0AD-40F6-9C81-836CFE259EEE}, Product language 1033
    MSI (s) (D4:B4) [14:43:49:062]: Note: 1: 2205 2:  3: MsiPatchSequence
    MSI (s) (D4:B4) [14:43:49:062]: Note: 1: 2203 2: RTM.1 3: -2147287038
    MSI (s) (D4:B4) [14:43:49:062]: PATCH SEQUENCER ERROR: failed to open RTM.1 transform in {76ACE427-ED4D-480E-BA64-8E09038A428B} patch! (1: 2203 2: RTM.1 3: -2147287038 )
    MSI (s) (D4:B4) [14:43:49:062]: SequencePatches returns error 1648.
    MSI (s) (D4:B4) [14:43:49:077]: Product: SQL Server 2017 SQL Polybase – Update ‘{533B2368-AC10-4078-A3AA-DA9F1B33E9FE}’ could not be installed. Error code 1648. Additional information is available in the log file C:\Program Files\Microsoft SQL Server\140\Setup Bootstrap\Log\20191129_143647\MSSQLSERVER\sql_polybase_core_inst_Cpu64_1.log.

    MSI (s) (D4:B4) [14:43:49:077]: Windows Installer installed an update. Product Name: SQL Server 2017 SQL Polybase. Product Version: 14.0.1000.169. Product Language: 1033. Manufacturer: Microsoft Corporation. Update Name: {533B2368-AC10-4078-A3AA-DA9F1B33E9FE}. Installation success or error status: 1648.

    MSI (s) (D4:B4) [14:43:49:077]: Note: 1: 1708
    MSI (s) (D4:B4) [14:43:49:077]: Product: SQL Server 2017 SQL Polybase — Installation failed.

    MSI (s) (D4:B4) [14:43:49:077]: Windows Installer installed the product. Product Name: SQL Server 2017 SQL Polybase. Product Version: 14.0.1000.169. Product Language: 1033. Manufacturer: Microsoft Corporation. Installation success or error status: 1648.
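
When analyzing a log like the one above, the first thing to collect is the list of cached MSP files it mentions. Here is a minimal sketch that does this with a regular expression. It assumes the cache lives under a Windows\Installer folder as in these logs, and the function name cached_msp_files is my own.

```python
import re

# Cached patches look like C:\Windows\Installer\<name>.msp (any letter case).
MSP_CACHE = re.compile(r"[A-Za-z]:\\Windows\\Installer\\\w+\.msp", re.IGNORECASE)


def cached_msp_files(log_text):
    """Return the distinct cached .msp paths in an MSI log, in first-seen order."""
    seen = set()
    found = []
    for path in MSP_CACHE.findall(log_text):
        key = path.lower()  # Windows paths are case-insensitive
        if key not in seen:
            seen.add(key)
            found.append(path)
    return found
```

Paths on the patch media, such as f:\...\sql_polybase_core_inst.msp, are deliberately not matched. Verbose MSI logs are often saved as UTF-16, so you may need to open the file with encoding="utf-16" before passing its text to the function.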

    We used the VBScript output to identify the matching MSP on the patch media and replaced the cached copy in C:\Windows\Installer with it. After that we ran the patch again and it completed successfully.

    As I mentioned earlier, you need to be a little careful when working with these files. Always take a backup first. Better safe than sorry!
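
To make the backup habit hard to skip, the copy step can be wrapped so that the existing cached file is always saved first. This is only a sketch under my own assumptions: the function name and the backup folder are hypothetical, and on a real server it would need to run from an elevated (administrative) session.

```python
import os
import shutil


def replace_with_backup(media_path, cached_path, backup_dir):
    """Back up the cached MSP (if present), then overwrite it from the media.

    Returns the path of the backup copy, or None if there was nothing to back up.
    """
    os.makedirs(backup_dir, exist_ok=True)
    backup_path = None
    if os.path.exists(cached_path):
        backup_path = os.path.join(backup_dir, os.path.basename(cached_path))
        shutil.copy2(cached_path, backup_path)  # keep the old file
    shutil.copy2(media_path, cached_path)  # same effect as the Copy command
    return backup_path
```

The arguments correspond to the copy command in the script output: media_path is the source sql_polybase_core_inst.msp and cached_path is the file under C:\Windows\Installer. If the patch still fails, the original file can be restored from the backup folder.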

  • Cheers,
  • Balmukund Lakhani
  • Twitter @blakhani
  • Author: SQL Server 2012 AlwaysOn (Paperback, Kindle)