Skip to content

cx_oracle requires method to force use of NCHAR and setinputsizes for strings #4163

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
sqlalchemy-bot opened this issue Jan 15, 2018 · 15 comments

Comments

@sqlalchemy-bot
Copy link
Collaborator

Migrated issue, originally created by Michael Bayer (@zzzeek)

thread at:

https://groups.google.com/d/msg/sqlalchemy/vsJBLakBP4Y/PVz5zSKCAAAJ

Anthony's info for a similar issue at:

oracle/python-cx_Oracle#119

potential APIs:

e = create_engine("oracle://...", use_nchar_for_unicode=True)

or:

from sqlalchemy.dialects.oracle import NVARCHAR2

...   Column('data', NVARCHAR2(50, use_nchar=True))

A more open-ended approach, since we occasionally have issues with setinputsizes being needed or not:

e = create_engine("oracle://...")
e.dialect.setinputsizes(Unicode, cx_Oracle.NCHAR)

the third approach could also accommodate for user-defined types. it would need some changes in DefaultDialect.set_input_sizes as well.

it's not that easy to let users just override get_dbapi_type() because that is only called upon the "impl" type that is mostly dialect-private.

the former is way easier to use but I don't know if people are going to want this applied to all datatypes across the board.

@sqlalchemy-bot
Copy link
Collaborator Author

Stefan Schwarzer (@sschwarzer) wrote:

About the second API:

If I understand the documentation at

https://docs.oracle.com/en/database/oracle/oracle-database/12.2/nlspg/supporting-multilingual-databases-with-unicode.html#GUID-95787DA3-08D4-41A7-9418-2A36C602014E

correctly, an Oracle NVARCHAR2 implies a cx_Oracle.NCHAR type. If you do an SQL SELECT on an NVARCHAR2 column in cx_Oracle, the cursor description contains the cx_Oracle.NCHAR datatype for the NVARCHAR2 column.

So a use_nchar parameter for sqlalchemy.dialects.oracle.NVARCHAR2 would be redundant, wouldn't it be?

Generally, as I understand the matter so far, Unicode in SQLAlchemy and N... string data types are orthogonal. For example, you can have a VARCHAR2 (not NVARCHAR2) column in the database which probably should be a unicode string if the database encoding NLS_CHARACTERSET is the recommended AL32UTF8.

So according to my current understanding, there are three different concepts (where the last two are always correlated for cx_Oracle):

  • The user wants the data, as it's visible on the Python side, to be unicode strings (only relevant for Python 2). I think this is what the use of SQLAlchemy's Unicode type means. Also, for a Unicode column, SQLAlchemy will use NVARCHAR2, not VARCHAR2, when creating a table with this column.

  • The datatype of the column is one of the N... character types, so it uses Oracle's NLS_NCHAR_CHARACTERSET, not the database character set NLS_CHARACTERSET.

  • For an N... character type, cx_Oracle uses its NCHAR type, see https://cx-oracle.readthedocs.io/en/latest/module.html#cx_Oracle.NCHAR , which mentions only NVARCHAR2. I haven't verified if NCHAR is also used for NCLOB.

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

Setting NCHAR across the board for all NVARCHAR2 would certainly be easiest. I've asked Anthony on the github issue.

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

here's a review at https://gerrit.sqlalchemy.org/#/c/zzzeek/sqlalchemy/+/633 that will test this new setting.

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

also if you could try that patch locally and see if it fixes your issue.

@sqlalchemy-bot
Copy link
Collaborator Author

Stefan Schwarzer (@sschwarzer) wrote:

Apart from NVARCHAR2, there are two other national character datatypes for Oracle databases, NCHAR (fixed length) and NCLOB (several GB). See https://docs.oracle.com/cd/B19306_01/server.102/b14200/sql_elements001.htm .

For testing the cx_Oracle types for these columns, I created a table

CREATE TABLE NCHAR_TEST (
  NVARCHAR2_COL NVARCHAR2(10),
  NCHAR_COL NCHAR(10),
  NCLOB_COL NCLOB
)

and ran SELECT * FROM NCHAR_TEST directly from cx_Oracle. The cx_Oracle datatypes for the columns, taken from Cursor.description, are:

  • NVARCHAR2 -> cx_Oracle.NCHAR (already known)
  • NCHAR -> cx_Oracle.FIXED_NCHAR
  • NCLOB -> cx_Oracle.NCLOB

Do any of these need to be considered (in addition to NVARCHAR2)? Please let me know if I should test the current patch or a later one that takes more datatypes into account.

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

these have already been included in setinputsizes, I only added NCHAR:

self._include_setinputsizes = {
cx_Oracle.NCLOB, cx_Oracle.CLOB, cx_Oracle.LOB,
cx_Oracle.BLOB, cx_Oracle.FIXED_CHAR, cx_Oracle.TIMESTAMP
}

important thing is patch solves your problem, right? it passes CI so is good on my end.

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

oh FIXED_NCHAR vs. FIXED_CHAR, can add that sure

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

this is added to the review

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

Use NCHAR + setinputsizes() for all NVARCHAR2

The cx_Oracle dialect now calls setinputsizes() with cx_Oracle.NCHAR
unconditionally when the NVARCHAR2 datatype, in SQLAlchemy corresponding
to sqltypes.Unicode(), is in use. Per cx_Oracle's author this allows
the correct conversions to occur within the Oracle client regardless
of the setting for NLS_NCHAR_CHARACTERSET.

Change-Id: I3989b7aaf2178c263015a7433939196b76baf1e4
Fixes: #4163

ff4898f

@sqlalchemy-bot
Copy link
Collaborator Author

Changes by Michael Bayer (@zzzeek):

  • changed status to closed

@sqlalchemy-bot
Copy link
Collaborator Author

Stefan Schwarzer (@sschwarzer) wrote:

Your fix works for me.

(Sorry, I wasn't able to try it earlier.)

Thank you! :-)

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

OK you have a workaroudn for now until I release 1.2.2 ?

@sqlalchemy-bot
Copy link
Collaborator Author

Stefan Schwarzer (@sschwarzer) wrote:

Yes, no problem. :)

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

this issue is going to have to be re-done entirely, see the discussion at https://bitbucket.org/zzzeek/sqlalchemy/issues/4242/nchar-cant-be-bound-for-every-string-value#comment-44855224

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

we will bind NCHAR when the NVARCHAR2 etc. datatypes are used, but if generic Unicode is passed, it will only bind to NCHAR if use_nchar_for_unicode=True is passed (e.g. my first idea above).

The change however de-emphasizes NCHAR at the DDL level as well so I'm going to keep it in 1.3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant