Friday, March 30, 2012

Invisible rows after failed syncronization

Publisher: SQL Server 2000 SP4
Subscriber: SQL Server 2005 Mobile Edition

Sometimes, after a failed merge replication(due to communication error) some rows on the subscriber became invisible for the publisher.

I've reproduced a case:

[START Short Version]
All rows inserted between a comm failed sync and a fine sync became invisible.
This seems to be caused because the subscriber keeps generating rows with the same generation number and the publisher doesn't look for this generation anymore.
[END Short Version]

[START Detailed case]
STEP 0: INITIAL STATE
Publisher
Orders Table
EMPTY

Subscriber
NO DATABASE

STEP 1: After FIRST SYNC and INSERTING 2 Orders

Publisher

Orders Table

EMPTY

Subscriber

Orders Table
OrderId __sysIG __sysCG __sysMC
000001 4 4 81
000002 4 4 81

STEP 2: After SECOND SYNC

Publisher

Orders Table

OrderId

000001

000002

Subscriber

Orders Table

OrderId __sysIG __sysCG __sysMC

000001 4 4 80

000002 4 4 80

STEP 3: INSERT ANOTHER ORDER

Subscriber

Orders Table

OrderId __sysIG __sysCG __sysMC

000001 4 4 80

000002 4 4 80

000003 6 6 81

STEP 4: After THIRD SYNC (with comm error)
Error: [NativeError:28037][HRESULT:-2147012889]->[A request to send data to the computer running IIS has failed. For more information, see HRESULT.]
When: Error happens provoked after sending from Subscriber to Publisher but before ending syn process.
Publisher
Orders Table

OrderId

000001

000002

000003
Subscriber
Orders Table

OrderId __sysIG __sysCG __sysMC

000001 4 4 80

000002 4 4 80
000003 6 6 81
Note: Publisher has the row but subscriber keeps the state in 81.

STEP 5: INSERT ANOTHER ORDER (BEFORE A GOOD SYNC)
Subscriber
Orders Table

OrderId __sysIG __sysCG __sysMC

000001 4 4 80

000002 4 4 80
000003 6 6 81

000004 6 6 81
Note: Orders 000003 and 000004 have the same system info

STEP 6: After FOURTH SYNC
Publisher
Orders Table

OrderId
000001

000002

000003
Subscriber
Orders Table

OrderId __sysIG __sysCG __sysMC

000001 4 4 80

000002 4 4 80
000003 6 6 81

000004 6 6 81

Note: Even with a fine merge process Publisher didn't get the 000004 row and the Subscriber didn't update the __sysMC

STEP 7: INSERT ANOTHER ORDER

Subscriber

Orders Table
OrderId __sysIG __sysCG __sysMC
000001 4 4 80
000002 4 4 80

000003 6 6 81

000004 6 6 81

000005 8 8 81

STEP 8: After FIFTH SYNC

Publisher
Orders Table
OrderId
000001
000002
000003
000005
Subscriber
Orders Table
OrderId __sysIG __sysCG __sysMC
000001 4 4 80
000002 4 4 80
000003 6 6 81

000004 6 6 81

000002 8 8 80

Note: Data from generation 8 merge correctly, but data from generation 6 keeps invisible to publisher.
STEP 9: DUMMY UPDATE ON GEN 6 ROWS
Action: I made a non relevant update to force the Subscriber update the generation.
Subscriber
Orders Table

OrderId __sysIG __sysCG __sysMC

000001 4 4 80

000002 4 4 80
000003 6 10 81
000004 6 10 81
000005 8 8 80

Note: sysCG get its value updated correctly.

STEP 10: After SIXTH SYNC

Publisher
Orders Table
OrderId
000001
000002
000003
000004

000005
Subscriber
Orders Table
OrderId __sysIG __sysCG __sysMC
000001 4 4 80
000002 4 4 80

000003 6 6 80

000004 6 6 80


000002 8 8 80

Note: Data merges correctly.
[END Detailed case]

I could code a system that checks all the publication tables and updates all the last generation rows in case of communication error but i would really like to avoid doing it.

I don't know if I'm missing something or if this is a bug or a known issue.

This is a big problem for me because communication errors on cellular phone based connection are quite common and users keep working even if there's an error.

Any comment will be appreciated.Finally, I've modified my replication process solve this manually.

On a normal case (no previous error):
1. I get the current generation (__sysMergeSubscriptions.LastUploadedGen+1)
2. Launch a normal sync.
3. If it fails I save the failed generation for using it on next sync.

On the previous error case:
1. Do a normal sync (to get a valid __sysMergeSubscriptions.LastUploadedGen)
2. Do a dummy update on all rows (from all published tables (from __sysMergeArticles)) that __sysIG or __sysCG equals the wrong generation number.
3. Do a normal sync to send all the invisible rows to the publisher.

I don't like my own solution, but that's the only way I have to keep the system working as I haven't find any reference to this problem anywhere.|||

Hey MeZKaL,

I just have few questions to narrow downt he problem point in the whole path of sync components.

In the same above scenario, can you please also let me know the contents of __sysMergeSubscription and
__sysMergeSubscriptionProperties table for that subscription in each of the above STEPs (esp. Steps 4 to 6)?

Basically, I just want to know which of the following is true (after a failed sync)

1) Client did not enumerate the changes and hence did not send the changes

2) Client has enumerated the changes, sent the changes but metadata does not say its new change

3) Client has enumerated, sent the changes, metadata also says its a new change, but publisher ignores it for some internal
reasons like publisher metadata says its a false change ...etc

Thanks,

Laxmi

|||I repeated the case saving a copy of the mobile db for each step.

There are no changes on __sysMergeSubscriptionProperties so I will put it just one time.

__sysMergeSubscriptionProperties
SubscriptionId: 87a0e646-d908-b0b9-e2b1-d6eb58165c8f
Distributor
DistributorAddress
DistributorNetwork: 0
DistributorSecurityMode: 1
DistributorLogin
DistributorPassword
ExchangeType: 3
InternetLogin: domain\user
InternetURL: URL
InternetPassword: *******
LoginTimeout: 15
ProfileName: DEFAULT
Publisher: SERVER\INSTANCE
PublisherAddress
PublisherNetwork: 0
PublisherDatabase: DBName
PublisherSecurityMode: 1
PublisherLogin: domain\user
PublisherPassword: *******
Publication: PubName
QueryTimeout: 300
Validate: 0
HostName
Subscriber: domain\user
InternetProxyServer
InternetProxyLogin
InternetProxyPassword
ConnectionRetryTimeout: 120
CompressionLevel: 1
ConnectionManager: 0
SnapshotTransferType: 99

Many of the field value from __sysMergeSubscriptions didn't change during the process.

__sysMergeSubscriptions
Publisher: server\instance
PublisherDatabase: DBName
Publication: PubName
PublicationId: 03e497f6-83b6-4de3-87e3-e97835880c94
SubscriptionId: 87a0e646-d908-b0b9-e2b1-d6eb58165c8f
SyncPartnerId: 751a4dfe-2eba-7a4a-b11a-35beec9a8bf3
SyncAnchorId: 00000000-0000-0000-0000-000000000000
SchemaVersion: 481
SchemaVersionId: 00000000-0000-0000-0000-000000000000
ReplNick (casted to int): -312916250
Retention: 0
MachineId: 374677b5-b002-d9f5-d4e4-4628538f2ccd
PathName: Path\DB.sdf
ReinitRequested: 0
ReplProp: NULL
HostName:
ValidationHostName: 0
ReplicaState: 00000000-0000-0000-0000-000000000000
LastCleanupDate: 15/11/2006 16:30
ReconcilerVersion: 8
Subscriber: domain\user

[START Detailed case]

STEP 1: After FIRST SYNC
Publisher
Orders Table
EMPTY

Subscriber
Orders Table
EMPTY

__sysMergeSubscriptions
SyncAnchor LastUploadedGen LastSuccessfulSync SentGen80
4977 0 15/11/2006 16:30 70
SentGen80Id: 61327c39-f34b-15d1-3924-8e2577f94d6f

STEP 2: After INSERTING 2 Orders
Subscriber
Orders Table
OrderId __sysIG __sysCG __sysMC
57000 4 4 81
57001 4 4 81

__sysMergeSubscriptions
SyncAnchor LastUploadedGen LastSuccessfulSync SentGen80
4977 0 15/11/2006 16:30 70
SentGen80Id: 61327c39-f34b-15d1-3924-8e2577f94d6f

STEP 3: After SECOND SYNC (with comm error)
Error: [NativeError:28037][HRESULT:-2147012889]->[A request to send data to the computer running IIS has failed. For more information, see HRESULT.]
When: Error happens provoked after sending from Subscriber to Publisher but before ending syn process.
Publisher
Orders Table
OrderId
57000
57001

Subscriber
Orders Table
OrderId __sysIG __sysCG __sysMC
57000 4 4 81
57001 4 4 81

__sysMergeSubscriptions
SyncAnchor LastUploadedGen LastSuccessfulSync SentGen80
4977 0 15/11/2006 16:30 70
SentGen80Id: 61327c39-f34b-15d1-3924-8e2577f94d6f

STEP 4: INSERT ANOTHER ORDER
Subscriber
Orders Table
OrderId __sysIG __sysCG __sysMC
57000 4 4 81
57001 4 4 81
57002 4 4 81

__sysMergeSubscriptions
SyncAnchor LastUploadedGen LastSuccessfulSync SentGen80
4977 0 15/11/2006 16:30 70
SentGen80Id: 61327c39-f34b-15d1-3924-8e2577f94d6f

STEP 5: After THIRD SYNC
Publisher
Orders Table
OrderId
57000
57001

Subscriber
Orders Table
OrderId __sysIG __sysCG __sysMC
57000 4 4 81
57001 4 4 81
57002 4 4 81

__sysMergeSubscriptions
SyncAnchor LastUploadedGen LastSuccessfulSync SentGen80
5095 5 15/11/2006 17:22 71
SentGen80Id: e5ff3df6-195c-3145-273e-86784ace7e16

STEP 6: DUMMY UPDATE
Subscriber
Orders Table
OrderId __sysIG __sysCG __sysMC
57000 4 6 81
57001 4 6 81
57002 4 6 81

__sysMergeSubscriptions
SyncAnchor LastUploadedGen LastSuccessfulSync SentGen80
5095 5 15/11/2006 17:22 71
SentGen80Id: e5ff3df6-195c-3145-273e-86784ace7e16

STEP 7: After FOURTH SYNC
Publisher
Orders Table
OrderId
57000
57001
57002

Subscriber
Orders Table
OrderId __sysIG __sysCG __sysMC
57000 4 6 80
57001 4 6 80
57002 4 6 80

__sysMergeSubscriptions
SyncAnchor LastUploadedGen LastSuccessfulSync SentGen80
5213 7 15/11/2006 17:35 72
SentGen80Id
SentGen80Id: 4171a971-b963-a541-300a-0b1602902e39

[__sysMergeSubscriptions DETAIL]
SyncAnchor LastUploadedGen LastSuccessfulSync SentGen80
step1 4977 0 15/11/2006 16:30 70
step2 4977 0 15/11/2006 16:30 70
step3 4977 0 15/11/2006 16:30 70
step4 4977 0 15/11/2006 16:30 70
step5 5095 5 15/11/2006 17:22 71
step6 5095 5 15/11/2006 17:22 71
step7 5213 7 15/11/2006 17:35 72

SentGen80Id
61327c39-f34b-15d1-3924-8e2577f94d6f
61327c39-f34b-15d1-3924-8e2577f94d6f
61327c39-f34b-15d1-3924-8e2577f94d6f
61327c39-f34b-15d1-3924-8e2577f94d6f
e5ff3df6-195c-3145-273e-86784ace7e16
e5ff3df6-195c-3145-273e-86784ace7e16
4171a971-b963-a541-300a-0b1602902e39|||Can you open a case with CSS please for this? That way we can get a better handle on this in a timely manner.|||My workaround runs pretty well and I don't have time now for spending hours and hours crossing mails and test results with a support service.

If anyone can help with the previous data, it'll be welcomed but I won't spent more time testing and logging for this issue. I've lots of work and not many time.

No comments:

Post a Comment