Data race of reading row value of timestamp data type #179

Open
7phs opened this issue Nov 20, 2023 · 3 comments
Labels
bug Something isn't working

Comments

7phs commented Nov 20, 2023

Hello,

I found a data race when reading data from Databricks concurrently, for example when several goroutines query different tables or read catalog metadata at the same time.

Code to reproduce the data race:

import (
	"context"
	"database/sql"
	"fmt"
	"sync"
	"testing"

	dbsql "github.com/databricks/databricks-sql-go"
)

func TestDatabricksDataRace(t *testing.T) {
	var (
		wg         sync.WaitGroup
		ctx        = context.Background()
		w          = make(chan bool)
		routineNum = 10
	)

	wg.Add(routineNum)

	for i := 0; i < routineNum; i++ {
		go func() {
			defer wg.Done()

			<-w

			if err := listCatalogs(ctx); err != nil {
				fmt.Println("ERROR:", err)
			}
		}()
	}

	close(w)

	wg.Wait()
}

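// listCatalogs opens its own connection and runs SHOW CATALOGS ten times.
// cfg holds the connection settings (host, port, access token, HTTP path);
// its definition is not shown in this issue.
func listCatalogs(ctx context.Context) error {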
	connector, err := dbsql.NewConnector(
		dbsql.WithServerHostname(cfg.Host),
		dbsql.WithPort(int(cfg.Port)),
		dbsql.WithAccessToken(cfg.AccessToken),
		dbsql.WithHTTPPath(cfg.HTTPPath),
	)
	if err != nil {
		return err
	}

	db := sql.OpenDB(connector)
	defer db.Close()

	for i := 0; i < 10; i++ {
		err := func() error {
			r, err := db.QueryContext(ctx, "SHOW CATALOGS;")
			if err != nil {
				return err
			}
			defer r.Close()

			for r.Next() {
				var s string
				if err := r.Scan(&s); err != nil {
					return err
				}
			}

			return nil
		}()

		if err != nil {
			return err
		}
	}

	return nil
}

Command to run this test with the race detector:

go test -race -run TestDatabricksDataRace .

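The root cause of the data race is the uninitialised field loc of arrow.TimestampType. It is initialised lazily on the first call to arrow.TimestampType.GetZone(), so concurrent first calls on the same shared type value read and write loc without synchronisation.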

A workaround for the data race is:

func init() {
	// Initialise `arrow.TimestampType` once, before any goroutine uses it.
	_, _ = arrow.FixedWidthTypes.Timestamp_us.(*arrow.TimestampType).GetToTimeFunc()
}

Environment:

  • go v1.21
  • databricks-sql-go v1.5.2

7phs commented Nov 20, 2023

Related Apache Arrow issue: apache/arrow#38795

kravets-levko added the bug label Apr 17, 2024

7phs commented May 29, 2024

databricks-sql-go has been updated and now uses Apache Arrow Go v16, in which the data race is fixed.

This bug is fixed now.

7phs closed this as completed May 29, 2024

7phs commented Jun 5, 2024

The data race bug still exists now that the Apache Arrow version has been reverted to v12.

7phs reopened this Jun 5, 2024