Encoding Text#

Consider the following examples of movie reviews:

Example 1

  • Review 1A (negative)
    I thought the movie would be great, but it is not. The script is weak, the pacing is slow, and the ending feels pointless.

  • Review 1B (positive)
    I thought the movie would not be great, but it is. The script is strong, the pacing is brisk, and the ending feels meaningful.

Example 2

  • Review 2A (negative)
    The acting is decent, but the plot is predictable and the jokes fall flat.

  • Review 2B (positive)
    The plot is predictable and the jokes fall flat, but the acting is decent.

Example 3

  • Review 3A (negative)
    For the first ninety minutes I waited for something exciting to happen. Spoiler: it never does.

  • Review 3B (positive)
    For the first ninety minutes I waited for something exciting to happen, and when it finally does it is worth every second.

The problem with a bag-of-words representation is that it treats each pair of reviews above as the same multiset of words: $(\text{acting}: 1,\ \text{plot}: 1,\ \text{predictable}: 1,\ \text{but}: 1,\ \ldots)$.

Because position is discarded, the classifier cannot learn that “not” negates what follows or that the clause after “but” usually carries the main sentiment.
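To see this concretely, we can count the words in Reviews 2A and 2B with Python's `collections.Counter`, a minimal stand-in for a bag-of-words vectorizer. The two reviews express opposite sentiment, yet their word counts are identical:

```python
from collections import Counter

review_2a = "The acting is decent, but the plot is predictable and the jokes fall flat."
review_2b = "The plot is predictable and the jokes fall flat, but the acting is decent."

def bag_of_words(text):
    # Lowercase and strip punctuation before counting. The result is the
    # "multiset of words" view: every bit of positional information is lost.
    tokens = text.lower().replace(",", "").replace(".", "").split()
    return Counter(tokens)

print(bag_of_words(review_2a) == bag_of_words(review_2b))   # True
```

Any classifier that only sees these counts must assign both reviews the same label.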

import numpy as np
from tensorflow.keras.layers import Embedding

# 1. Toy vocabulary: 5 tokens
#    0 = <PAD>, 1 = "good", 2 = "bad", 3 = "not", 4 = "movie"
vocab_size   = 5
embed_dim    = 3   # just three numbers per word
sequence_len = 4

embed = Embedding(input_dim=vocab_size,
                  output_dim=embed_dim)   # `input_length` is deprecated; Keras infers it

# 2. Example review, integer‑encoded and padded:
#    "not good movie"   →   [3, 1, 4, 0]
sample = np.array([[3, 1, 4, 0]])        # shape (1, 4)

dense_seq = embed(sample)                # shape (1, 4, 3)
print(dense_seq.numpy().round(2))
[[[ 0.02  0.03 -0.02]
  [-0.05  0.04 -0.03]
  [ 0.04  0.03  0.05]
  [ 0.   -0.01  0.03]]]

Word Embeddings#

Unlike bag-of-words counts, word embeddings are dense vectors that are trainable: during training they gradually shift so that "good" and "bad" point in different directions, and because the tokens keep their order, a sequential layer (a CNN or RNN) reading them can learn that "not" flips the meaning of what follows.

import numpy as np
from tensorflow.keras.layers import Embedding

# 1. Toy vocabulary: 6 token IDs
#    0 = <PAD>, 1 = "good", 2 = "bad", 3 = "not", 4 = "movie", plus one spare index
vocab_size   = 6
embed_dim    = 2   # just two numbers per word this time
sequence_len = 4

embed = Embedding(input_dim=vocab_size,
                  output_dim=embed_dim)

# 2. Example sequence of token IDs. Index 3 appears twice, so the
#    first and last rows of the output are identical.
sample = np.array([[3, 1, 5, 3]])        # shape (1, 4)

dense_seq = embed(sample)                # shape (1, 4, 2)
print(dense_seq.numpy().round(2))
[[[ 0.02  0.  ]
  [-0.03  0.01]
  [ 0.05  0.02]
  [ 0.02  0.  ]]]
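Under the hood, an `Embedding` layer is nothing more than a lookup into a trainable weight matrix of shape `(vocab_size, embed_dim)`: each token ID selects one row. A pure-NumPy sketch of that lookup (the weight matrix here is random for illustration, not Keras's actual initial values):

```python
import numpy as np

vocab_size, embed_dim = 6, 2
rng = np.random.default_rng(0)

# The layer's only parameters: one row per token in the vocabulary.
W = rng.normal(scale=0.05, size=(vocab_size, embed_dim))

sample = np.array([[3, 1, 5, 3]])   # shape (1, 4), same IDs as above
dense_seq = W[sample]               # fancy indexing == embedding lookup
print(dense_seq.shape)              # (1, 4, 2)

# Repeated token IDs map to identical rows: positions 0 and 3 match.
assert np.array_equal(dense_seq[0, 0], dense_seq[0, 3])
```

Training an embedding layer means adjusting the rows of `W` by gradient descent, just like any other weight matrix.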
Using a CNN#

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# ------------------------------------------------------------------
# 1. Data loading and preprocessing
# ------------------------------------------------------------------
max_features = 20_000      # keep the 20,000 most frequent words
maxlen       = 400         # cut / pad every review to 400 tokens

(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(
    num_words=max_features
)

x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=maxlen)
x_test  = keras.preprocessing.sequence.pad_sequences(x_test,  maxlen=maxlen)

# ------------------------------------------------------------------
# 2. CNN model
# ------------------------------------------------------------------
model = keras.Sequential([
    layers.Embedding(max_features, 128),   # `input_length` is deprecated; Keras infers it

    layers.Conv1D(64, 7, activation="relu"),
    layers.MaxPooling1D(3),

    layers.Conv1D(64, 7, activation="relu"),
    layers.GlobalMaxPooling1D(),

    layers.Dense(1, activation="sigmoid")
])

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"]
)

model.summary()

# ------------------------------------------------------------------
# 3. Training
# ------------------------------------------------------------------
history = model.fit(
    x_train,
    y_train,
    epochs=8,
    batch_size=128,
    validation_split=0.2,
    verbose=0
)

# ------------------------------------------------------------------
# 4. Evaluation
# ------------------------------------------------------------------
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f"Test accuracy: {test_acc:.3f}")

# ------------------------------------------------------------------
# 5. Predictions (first ten test examples, rounded for readability)
# ------------------------------------------------------------------
preds = model.predict(x_test[:10]).round(3).squeeze()
print("Predicted probabilities:", preds)
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                     Output Shape                  Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ embedding_2 (Embedding)         │ ?                      │   0 (unbuilt) │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv1d (Conv1D)                 │ ?                      │   0 (unbuilt) │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling1d (MaxPooling1D)    │ ?                      │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv1d_1 (Conv1D)               │ ?                      │   0 (unbuilt) │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ global_max_pooling1d            │ ?                      │             0 │
│ (GlobalMaxPooling1D)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ ?                      │   0 (unbuilt) │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 0 (0.00 B)
 Trainable params: 0 (0.00 B)
 Non-trainable params: 0 (0.00 B)
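The `?` shapes appear because Keras 3 builds the layers lazily, on the first batch. We can still trace the sequence length by hand: a `"valid"` Conv1D with kernel size k shortens a length-n sequence to n - k + 1, and MaxPooling1D with pool size p divides the length by p, rounding down. A small sketch of that arithmetic for the model above:

```python
def conv1d_len(n, kernel):
    # "valid" padding, stride 1: the kernel fits n - kernel + 1 times
    return n - kernel + 1

def pool1d_len(n, pool):
    # non-overlapping windows: floor division by the pool size
    return n // pool

n = 400                          # padded review length
n = conv1d_len(n, 7); print(n)   # 394 after the first Conv1D
n = pool1d_len(n, 3); print(n)   # 131 after MaxPooling1D
n = conv1d_len(n, 7); print(n)   # 125 after the second Conv1D
# GlobalMaxPooling1D then collapses those 125 time steps into a single
# 64-dimensional vector, which feeds the final sigmoid unit.
```

This is why `GlobalMaxPooling1D` is needed: it removes the time dimension so a plain `Dense` layer can produce one probability per review.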

File /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/tensorflow/python/util/traceback_utils.py:150, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    148 filtered_tb = None
    149 try:
--> 150   return fn(*args, **kwargs)
    151 except Exception as e:
    152   filtered_tb = _process_traceback_frames(e.__traceback__)

File /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/tensorflow/python/util/dispatch.py:1260, in add_dispatch_support.<locals>.decorator.<locals>.op_dispatch_handler(*args, **kwargs)
   1258 # Fallback dispatch system (dispatch v1):
   1259 try:
-> 1260   return dispatch_target(*args, **kwargs)
   1261 except (TypeError, ValueError):
   1262   # Note: convert_to_eager_tensor currently raises a ValueError, not a
   1263   # TypeError, when given unexpected types.  So we need to catch both.
   1264   result = dispatch(op_dispatch_handler, args, kwargs)

File /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/tensorflow/python/ops/math_ops.py:1021, in cast(x, dtype, name)
   1014     logging.warn(
   1015         f"You are casting an input of type {x.dtype.name} to an "
   1016         f"incompatible dtype {base_type.name}.  This will "
   1017         "discard the imaginary part and may not be what you "
   1018         "intended."
   1019     )
   1020   if x.dtype != base_type:
-> 1021     x = gen_math_ops.cast(x, base_type, name=name)
   1022 return x

File /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/tensorflow/python/ops/gen_math_ops.py:2130, in cast(x, DstT, Truncate, name)
   2128   Truncate = False
   2129 Truncate = _execute.make_bool(Truncate, "Truncate")
-> 2130 _, _, _op, _outputs = _op_def_library._apply_op_helper(
   2131       "Cast", x=x, DstT=DstT, Truncate=Truncate, name=name)
   2132 _result = _outputs[:]
   2133 if _execute.must_record_gradient():

File /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/tensorflow/python/framework/op_def_library.py:755, in _apply_op_helper(op_type_name, name, **keywords)
    752 def _apply_op_helper(op_type_name, name=None, **keywords):  # pylint: disable=invalid-name
    753   """Implementation of apply_op that returns output_structure, op."""
--> 755   op_def, g, producer = _GetOpDef(op_type_name, keywords)
    756   name = name if name else op_type_name
    758   attrs, attr_protos = {}, {}

File /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/tensorflow/python/framework/op_def_library.py:735, in _GetOpDef(op_type_name, keywords)
    731 try:
    732   # Need to flatten all the arguments into a list.
    733   # pylint: disable=protected-access
    734   g = ops._get_graph_from_inputs(_Flatten(keywords.values()))
--> 735   producer = g.graph_def_versions.producer
    736   # pylint: enable=protected-access
    737 except AssertionError as e:

File /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/tensorflow/python/framework/ops.py:2271, in Graph.graph_def_versions(self)
   2268   if self._finalized:
   2269     raise RuntimeError("Graph is finalized and cannot be modified.")
-> 2271 @property
   2272 def graph_def_versions(self) -> versions_pb2.VersionDef:
   2273   # pylint: disable=line-too-long
   2274   """The GraphDef version information of this graph.
   2275 
   2276   For details on the meaning of each version, see
   (...)   2280     A `VersionDef`.
   2281   """
   2282   return versions_pb2.VersionDef.FromString(self._version_def)

KeyboardInterrupt: 

A CNN is great at spotting local phrases like “not good,” but it quickly forgets where it found them; sentiment often hinges on relations far apart in the text, so we need a model that can keep track of information across the whole sequence.
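To make the "local phrases" intuition concrete, here is a minimal sketch (the kernel size and filter count are illustrative choices, not values from the text) of a `Conv1D` layer sliding a 3‑token window over the embedded toy review from above:

```python
import numpy as np
from tensorflow.keras.layers import Embedding, Conv1D

# Same toy setup as before: 0 = <PAD>, 1 = "good", 2 = "bad", 3 = "not", 4 = "movie"
vocab_size, embed_dim = 5, 3
kernel_size, n_filters = 3, 8     # each filter scans 3 consecutive tokens

embed = Embedding(input_dim=vocab_size, output_dim=embed_dim)
conv  = Conv1D(filters=n_filters, kernel_size=kernel_size, padding="valid")

# "not good movie" → [3, 1, 4, 0], shape (1, 4)
sample   = np.array([[3, 1, 4, 0]])
features = conv(embed(sample))    # one feature vector per 3-token window
print(features.shape)             # (1, 2, 8): 4 - 3 + 1 = 2 windows
```

Each of the two output positions summarizes one 3‑token window, so a filter can learn to fire on "not good" — but anything outside its window is invisible to it.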

  • A Conv1D with kernel size \(k=7\) can only look at \(7\) consecutive tokens at a time.

  • Stacking layers enlarges the receptive field only linearly (one layer sees 7 tokens, two layers see 13, three see 19, and so on).

  • Important cues in a review are often dozens of tokens apart (“I hoped it would be great…but it is not.”).
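The linear growth above follows from simple arithmetic: with stride 1, each extra layer of kernel size \(k\) adds \(k-1\) tokens to the receptive field, giving \(n(k-1)+1\) tokens after \(n\) layers. A quick sketch of the calculation:

```python
def receptive_field(num_layers: int, kernel_size: int) -> int:
    """Tokens visible to one output position after stacking
    num_layers stride-1 Conv1D layers of the same kernel size."""
    return num_layers * (kernel_size - 1) + 1

for n in (1, 2, 3, 4):
    print(n, receptive_field(n, 7))   # 7, 13, 19, 25 — linear growth
```

Covering a 100‑token gap with kernel‑7 layers would take roughly 17 stacked layers — far deeper than the small CNNs used here.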

On the IMDB dataset, a CNN typically reaches an accuracy of about \(0.88\), essentially the same as the bag‑of‑words MLP, and then plateaus. Extra filters or layers add parameters but do not solve the fundamental distance problem.