Friday, August 3, 2018

Big Query External Table Usuage




Usage:-

  1. Read the data  from the file on Google storage  without loading it into Big query by treating the file as temporary table.
  2. We can query the data of file  and also perform DML operations using this data of file.

Examples:-

  1. Query the file data:-

bq --location=US query --external_table_definition=ORDER_DEL::INTEGRATION_ID:STRING@CSV=gs://cloud-dw-poc.appspot.com/PREPARED_LAYER/results-20180801-181812.csv 'SELECT INTEGRATION_ID FROM ORDER_DEL;'



  1. Update the  Main table using the data of File.

bq --location=US query   --use_legacy_sql=false --external_table_definition=ORDER_DEL::INTEGRATION_ID:STRING@CSV=gs://cloud-dw-poc.appspot.com/PREPARED_LAYER/results-20180801-181812.csv 'UPDATE  `cloud-dw-poc.Z_NOT_USED.ORDER_DEL_NEW1` TGT SET   DELETE_FLG ="Y" FROM ORDER_DEL SRC  WHERE  TGT.INTEGRATION_ID =SRC.INTEGRATION_ID;'

  1. Append to   Main table using the data of File.

bq query --append_table=true --batch=false --use_legacy_sql=false --destination_table=my-second-project-196504:DML_Operations.CRIME_DL --external_table_definition=TABLE::unique_key:INTEGER@CSV=gs://my-second-project-196504/CRIME.csv 'SELECT unique_key FROM TABLE;'




Tuesday, July 31, 2018

PWX-14607 Internal error: Invalid Serialization/mutex handle passed to function



There was space issue in the unix box. Once we freed the space, did not get this error again.

Friday, July 27, 2018

Informatica Metadata Query For Parameter File

SELECT DISTINCT
      B.SUBJECT_AREA,
      B.TASK_NAME AS SES_WF_NAME,
      A.ATTR_VALUE AS PRM_FILE_PATH
FROM
       REP_TASK_ATTR A,
       REP_ALL_TASKS B
WHERE
        A.ATTR_ID IN (1,4)
        AND A.TASK_ID = B.TASK_ID
        AND A.ATTR_VALUE IS NOT NULL
ORDER BY 1,2 ASC

-------------------------------------------------------------------------------------------------

SELECT
     *
 FROM
     (
         SELECT
             attr_id,
             b.subject_area,
             b.task_name    AS ses_wf_name,
             a.attr_value   AS prm_file_path,
             ROW_NUMBER() OVER(
                 PARTITION BY a.task_id
                 ORDER BY
                     a.version_number DESC
             ) AS rn
         FROM
             rep_task_attr a,
             rep_all_tasks b
         WHERE
             a.attr_id IN (
                 1,
                 4
             )
             AND a.task_id = b.task_id
             AND a.attr_value IS NOT NULL
             AND b.subject_area = 'RAW_DATA'
     )
 WHERE
     rn = 1